Detecting methylation in a subpopulation of genomic DNA

ABSTRACT

This invention provides methods of determining the biological, pathological, genetic, epigenetic or disease status in a biological sample by determining the methylation status of a subpopulation of genomic DNA in the sample.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 61/442,918, filed on Feb. 15, 2011, which is hereby incorporated herein in its entirety for all purposes.

FIELD OF THE INVENTION

The present invention relates to methods for determining the methylation status of a subpopulation of genomic DNA in a biological sample.

BACKGROUND OF THE INVENTION

Most DNA in a cell is packaged around a set of histone proteins in a coiled structure known as a nucleosome. Nucleosomes, in turn, are further coiled into a highly condensed structure that tightly compacts the DNA. This combination of DNA and protein packaging is generally referred to as chromatin. Chromatin has two forms: euchromatin, a loosely packaged form of chromatin in which the DNA is accessible to transcriptional machinery and is usually, but not always, transcriptionally active, and heterochromatin, a tightly packaged form in which the DNA is inaccessible to transcriptional machinery and is usually, but not always, transcriptionally silent.

The transition between euchromatin and heterochromatin is mainly controlled by three epigenetic events: DNA methylation, histone modification, and RNA interaction. These epigenetic events affect whether genomic DNA in a cell is in a loosely packaged, transcriptionally active form or a tightly packaged, transcriptionally silent form.

SUMMARY OF THE INVENTION

The present invention provides methods of detecting a biological, pathological, genetic or epigenetic state of a subpopulation of genomic DNA (gDNA) in a sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching a subpopulation of gDNA in the first portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the second portion at the one or more gDNA regions indicates or is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA.
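The comparison in step c) can be illustrated with a minimal sketch. It assumes that percent methylation has already been measured for each gDNA region in both portions. The function name, the `RegionCall` record and the 10-percentage-point `min_difference` cutoff are hypothetical choices made for illustration and are not specified by the method.

```python
from dataclasses import dataclass


@dataclass
class RegionCall:
    region: str
    enriched_pct: float    # percent methylation in the enriched first portion
    reference_pct: float   # percent methylation in the second portion
    difference: float      # enriched minus reference, in percentage points
    differs: bool          # True when |difference| reaches the cutoff


def compare_portions(enriched, reference, min_difference=10.0):
    """Compare per-region percent methylation between two portions.

    `enriched` and `reference` map region names to percent methylation.
    `min_difference` is an illustrative cutoff (an assumption), not a
    value taken from the specification.
    """
    calls = []
    # Only regions measured in both portions can be compared.
    for region in sorted(set(enriched) & set(reference)):
        diff = enriched[region] - reference[region]
        calls.append(
            RegionCall(region, enriched[region], reference[region],
                       diff, abs(diff) >= min_difference)
        )
    return calls


if __name__ == "__main__":
    first_portion = {"GSTP1": 52.0, "CTRL": 3.0}
    second_portion = {"GSTP1": 2.0, "CTRL": 2.5}
    for call in compare_portions(first_portion, second_portion):
        print(call)
```

In this sketch a region is flagged when the two portions differ by at least the chosen cutoff in either direction; whether an increase or a decrease is informative depends on the embodiment.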

In a related aspect, the invention provides methods of detecting the presence of cancer in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA, wherein the biological sample comprises cells suspected of being cancerous, into at least a first portion and a second portion;

b) enriching a subpopulation of gDNA in the first portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the enriched gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample.

In a related aspect, the invention provides methods of detecting the presence of cancer in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA, wherein the biological sample comprises cells suspected of being cancerous, into at least a first portion and a second portion;

b) enriching for inaccessible gDNA in the first portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the inaccessible gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample. In some embodiments, an increase in the extent of DNA methylation in the inaccessible gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample.

In a related aspect, the invention provides methods of detecting the presence of cancer in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA, wherein the biological sample comprises cells suspected of being cancerous, into at least a first portion and a second portion;

b) enriching for accessible gDNA in the first portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the accessible gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample. In some embodiments, an increase in the extent of DNA methylation in the accessible gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample.

In a related aspect, the invention provides methods of detecting the presence of cancer in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA, wherein the biological sample comprises cells suspected of being cancerous, into at least a first portion and a second portion;

b) enriching the first portion by performing chromatin immunoprecipitation (“ChIP”); and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the ChIP-enriched first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample. In some embodiments, an increase in the extent of DNA methylation in the ChIP-enriched first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the presence of cancer in the biological sample. In various embodiments, ChIP enrichment can be for gDNA associated with or bound to histones containing a modification of interest (e.g., trimethylation of lysine 4 of histone 3) or gDNA associated with or bound to a protein of interest (e.g., RNA polymerase II).

In a further aspect, the invention provides methods of determining genomic imprinting of a preselected gDNA region in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for inaccessible gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the inaccessible gDNA in the first portion that is about 100% and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with proper imprinting of the preselected gDNA region, and wherein an extent of DNA methylation in the inaccessible gDNA in the first portion that is less than about 100% (e.g., less than about 90%, 85%, 80%, 75%) and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with loss of imprinting of the preselected gDNA region.
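The decision rule in step c) can be sketched as a small classifier. The sketch assumes percent methylation has been measured at the preselected region in both portions. The 90% cutoff for “about 100%” is one of the example values given above; the 40% to 60% window used for “about 50%” and the "indeterminate" outcome are assumptions added for illustration only.

```python
def classify_imprinting(inaccessible_pct, total_pct,
                        full_cutoff=90.0, half_window=(40.0, 60.0)):
    """Classify imprinting status at a preselected gDNA region.

    inaccessible_pct: percent methylation in the enriched inaccessible gDNA.
    total_pct:        percent methylation in total gDNA.
    full_cutoff:      lower bound treated as "about 100%" (example value).
    half_window:      range treated as "about 50%" (hypothetical assumption).
    """
    low, high = half_window
    if not (low <= total_pct <= high):
        # The rule only applies when total gDNA is about 50% methylated.
        return "indeterminate"
    if inaccessible_pct >= full_cutoff:
        return "proper imprinting"
    return "loss of imprinting"


if __name__ == "__main__":
    print(classify_imprinting(98.0, 51.0))
    print(classify_imprinting(72.0, 49.0))
    print(classify_imprinting(98.0, 80.0))
```

The same structure could be adapted to the other imprinting embodiments by changing which portion is enriched and which cutoff applies.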

In a further aspect, the invention provides methods of determining genomic imprinting of a preselected gDNA region in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for accessible gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the accessible gDNA in the first portion that is about 100% and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with proper imprinting of the preselected gDNA region, and wherein an extent of DNA methylation in the accessible gDNA in the first portion that is less than about 100% (e.g., less than about 90%, 85%, 80%, 75%) and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with loss of imprinting of the preselected gDNA region.

In a further aspect, the invention provides methods of determining genomic imprinting of a preselected gDNA region in a biological sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching the first portion by performing ChIP and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the ChIP-enriched gDNA in the first portion that is about 100% and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with proper imprinting of the preselected gDNA region, and wherein an extent of DNA methylation in the ChIP-enriched gDNA in the first portion that is less than about 100% (e.g., less than about 90%, 85%, 80%, 75%) and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with loss of imprinting of the preselected gDNA region. In various embodiments, ChIP enrichment can be for gDNA associated with or bound to histones containing a modification of interest (e.g., trimethylation of lysine 4 of histone 3) or gDNA associated with or bound to a protein of interest (e.g., RNA polymerase II).

In a related aspect, the invention provides methods of determining genomic imprinting of a preselected gDNA region in a biological sample, the method comprising:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for accessible gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the accessible gDNA in the first portion that is about 0% (e.g., less than about 5%) and an extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region indicates or is correlated with proper imprinting of the preselected gDNA region.

In another aspect, the invention provides methods of detecting a biological, pathological, genetic or epigenetic state of accessible gDNA in a sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for accessible gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the total gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA.

In another aspect, the invention provides methods of detecting a biological, pathological, genetic or epigenetic state of inaccessible gDNA in a sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for inaccessible gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the total gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA.

In another aspect, the invention provides methods of detecting a biological, pathological, genetic or epigenetic state of ChIP-enriched gDNA in a sample. In some embodiments, the methods comprise:

a) dividing a biological sample comprising gDNA into at least a first portion and a second portion;

b) enriching for ChIP-enriched gDNA in the first portion and retaining total gDNA in the second portion; and

c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the total gDNA in the second portion at the one or more gDNA regions indicates or is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA. In various embodiments, ChIP enrichment can be for gDNA associated with or bound to histones containing a modification of interest (e.g., trimethylation of lysine 4 of histone 3) or gDNA associated with or bound to a protein of interest (e.g., RNA polymerase II).

With respect to the foregoing embodiments, in some embodiments the methods further comprise the step of obtaining the biological sample. In some embodiments, the biological sample comprises isolated nuclei. In some embodiments, the biological sample is a population of cells. As needed or appropriate, the population of cells is treated with a permeabilization agent prior to the enrichment step b). The population of cells may also be treated with a DNA modification agent, e.g., an enzyme, a chemical or a drug that modifies DNA, prior to enrichment step b). In some embodiments, the population of cells is treated with a permeabilization agent and a DNA modification agent prior to the enrichment step b). In various embodiments, the population of cells can be in situ. In some embodiments, the biological sample is a solid tissue sample.

In some embodiments, an in situ treatment step is performed prior to the enriching step. For example, the gDNA may be subject to DNA cleavage, DNA modification or cross-linking (e.g., for carrying out chromatin immunoprecipitation (“ChIP”)).

In some embodiments, the enriching step comprises enriching for accessible chromatin. Accessible gDNA can be enriched by any method in the art. Generally, the gDNA in the biological sample is modified with a modifying agent and the modified gDNA is purified, thereby yielding the first portion. For example, in various embodiments, the accessible gDNA can be enriched by contacting the DNA with a DNA modifying agent and then isolating modified DNA, e.g., via affinity purification based on the modification. The modification is preferably a non-native or non-naturally occurring modification. An illustrative modification is DNA methylation. The DNA methylation enzyme can modify cytosine at the 4 or 6 position or adenine at the 6 position. Other DNA modification enzymes may be utilized and may methylate other bases at different positions. In some embodiments, the modifying agent is a methylation agent that methylates adenine. In some embodiments, the accessible gDNA can be enriched by contacting the gDNA with an adenine methyltransferase, and then isolating gDNA modified with 6-methyladenine (6-mA).

In some embodiments, the enriching step comprises enriching for inaccessible chromatin. Inaccessible gDNA can be enriched by any method in the art. For example, in various embodiments, inaccessible gDNA is enriched by concurrently contacting the biological sample with a modifying agent and a cell membrane disrupting agent, and purifying the modified gDNA, thereby yielding the first portion. In some embodiments, the modifying agent is an enzyme, chemical or drug that cleaves DNA. Examples of enzymes include nucleases such as DNase I and restriction enzymes. In some embodiments, the modifying agent is a DNA nuclease, e.g., DNase I or MNase. In some embodiments, the modifying agent is a restriction enzyme. In some embodiments, the modifying agent is an enzyme, chemical or drug that modifies DNA.

In some embodiments, the methods further comprise the step of performing chromatin immunoprecipitation (“ChIP”). For example, enrichment can be for gDNA associated with or bound to histones containing a modification of interest (e.g., trimethylation of lysine 4 of histone 3) or gDNA associated with or bound to a protein of interest (e.g., RNA polymerase II).

The second portion oftentimes represents a control. The control may be gDNA from a biological sample that has been treated (e.g., with a pharmacological agent or drug) or gDNA from an untreated biological sample. In some embodiments, the second portion comprises total gDNA.

The extent of methylation can be determined using any method known in the art. For example, in various embodiments, the extent of DNA methylation is determined via restriction enzyme analysis, e.g., using methylation-sensing restriction enzymes. As understood by those of skill and described herein, restriction enzymes that sense DNA methylation, e.g., methylation-sensitive and/or methylation-dependent enzymes, find use. In other embodiments, the extent of DNA methylation is determined by contacting the gDNA with bisulfite and detecting methylation of bisulfite-modified gDNA, e.g., using any appropriate technique. As understood by those of skill and described herein, bisulfite modification is performed and the extent of DNA methylation can then be determined by various techniques, including without limitation Methylation Specific PCR (“MSP”), COBRA (as described by Xiong and Laird, Nucleic Acids Res. (1997) 25(12):2532-4), DNA sequencing, etc. In some embodiments, the extent of DNA methylation is determined via affinity purification, e.g., using protein binding or direct or indirect antibody detection. For example, as understood by those of skill and described herein, antibodies that directly bind to methylated DNA bases find use to immunoprecipitate methylated DNA. The immunoprecipitated genomic regions (containing methylated DNA) can be detected using known techniques, e.g., PCR, DNA sequencing, microarray, etc. Also, proteins that bind with high affinity to methylated gDNA, e.g., MBD proteins or MeCP2, can be used for affinity purification of methylated DNA. Antibodies that bind to such proteins bound to methylated gDNA also can be used to immunoprecipitate methylated DNA. In other embodiments, the extent of DNA methylation is determined via direct nucleic acid sequencing. For example, single-molecule, real-time (SMRT) DNA sequencing or nanopore sequencing finds use to directly detect DNA methylation.
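As an illustration of the bisulfite-based readout, the following minimal sketch simulates the conversion in silico. It relies on the general principle that bisulfite treatment converts unmethylated cytosine to uracil (read as T after amplification) while 5-methylcytosine is protected. The function names, the toy sequence and the read set are hypothetical, and the sketch ignores incomplete conversion and strand effects.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C becomes T; C at a 0-based index listed in
    `methylated_positions` is protected and stays C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def percent_methylation(reads, position):
    """Percent of reads showing C (methylated) versus T at one position."""
    c_count = sum(1 for read in reads if read[position] == "C")
    t_count = sum(1 for read in reads if read[position] == "T")
    if c_count + t_count == 0:
        raise ValueError("no informative reads at this position")
    return 100.0 * c_count / (c_count + t_count)


if __name__ == "__main__":
    template = "ACGTCGA"
    # Two molecules methylated at index 1, two fully unmethylated.
    reads = [
        bisulfite_convert(template, {1}),
        bisulfite_convert(template, set()),
        bisulfite_convert(template, {1}),
        bisulfite_convert(template, set()),
    ]
    print(reads)
    print(percent_methylation(reads, 1))
```

Running the same calculation on reads from the first portion and from the second portion gives the two extents of methylation that the methods compare.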

In some embodiments, the extent of methylation at the one or more gDNA regions in the first portion is higher than the extent of methylation at the one or more gDNA regions in the second portion. In some embodiments, the extent of methylation at the one or more gDNA regions in the first portion is lower than the extent of methylation at the one or more gDNA regions in the second portion.

DEFINITIONS

The term “biological sample” refers to any sample comprising genomic DNA.

“Permeabilizing” a cell membrane, as used herein, refers to reducing the integrity of a cell membrane to allow for entry of a modifying agent into the cell. A cell with a permeabilized cell membrane will generally retain the cell membrane such that the cell's structure remains substantially intact. In contrast, “disrupting” a cell membrane, as used herein, refers to reducing the integrity of a cell membrane such that the cell's structure does not remain intact. For example, contacting a cell membrane with a nonionic detergent will remove and/or dissolve a cell membrane, thereby allowing access of a modifying agent to genomic DNA that retains at least some chromosomal structure.

A “DNA modifying agent,” as used herein, refers to a molecule that alters DNA in a detectable manner. For example, addition of chemical moieties to the DNA, or removal of chemical moieties from the DNA, is a modification. DNA modifying agents that do not result in DNA cleavage include, but are not limited to, DNA methylases or methyltransferases.

A “DNA cleaving agent,” as used herein, refers to a molecule that cleaves DNA. For example, a DNA cleaving agent can cause DNA nicking or cleavage.

A “DNA region,” as used herein, refers to a target sequence of interest within genomic DNA. The DNA region can be of any length that is of interest. In some embodiments, the DNA region is accessible by the DNA modifying agent being used. In some embodiments, the DNA region can include a single base pair, but can also be a short segment of sequence within genomic DNA (e.g., 2-100, 2-500, 50-500 bp) or a larger segment (e.g., 100-10,000, 100-1000, or 1000-5000 bp). In some embodiments, the amount of DNA in a DNA region is determined by the amount of sequence to be amplified in a PCR reaction. For example, standard PCR reactions generally can amplify between about 35 and 5000 base pairs. Alternatively, a DNA region can be a gene or chromosomal region of interest.

A different “extent” of modifications refers to a different number (actual or relative) of modified copies of one or more DNA regions between samples or between two or more DNA regions in one or more samples. For example, if 100 copies of two DNA regions (designated for convenience as “region A” and “region B”) are each present in chromosomal DNA in a cell, an example of modification to a different extent would be if 10 copies of region A were modified whereas 70 copies of region B were modified.
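The region A and region B example can be written out as a short calculation. This is a minimal sketch in which the function name is hypothetical; it simply expresses the actual (copy count) and relative (fraction) extents described above.

```python
def modification_extent(modified_copies, total_copies):
    """Relative extent of modification: modified copies / total copies."""
    if total_copies <= 0:
        raise ValueError("total_copies must be positive")
    if not 0 <= modified_copies <= total_copies:
        raise ValueError("modified_copies must lie between 0 and total_copies")
    return modified_copies / total_copies


# Values from the example in the text: 100 copies of each region,
# 10 modified copies of region A and 70 modified copies of region B.
region_a = modification_extent(10, 100)
region_b = modification_extent(70, 100)

if __name__ == "__main__":
    print(f"region A: {region_a:.0%} modified")
    print(f"region B: {region_b:.0%} modified")
    print("modified to a different extent:", region_a != region_b)
```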

The terms “oligonucleotide” or “polynucleotide” or “nucleic acid” interchangeably refer to a polymer of monomers that corresponds to a ribose nucleic acid (RNA) or deoxyribose nucleic acid (DNA) polymer, or analog thereof. This includes polymers of nucleotides such as RNA and DNA, as well as modified forms thereof, peptide nucleic acids (PNAs), locked nucleic acids (LNA™), and the like. In certain applications, the nucleic acid can be a polymer that includes multiple monomer types, e.g., both RNA and DNA subunits.

A nucleic acid is typically single-stranded or double-stranded and will generally contain phosphodiester bonds, although in some cases, as outlined herein, nucleic acid analogs are included that may have alternate backbones, including, for example and without limitation, phosphoramide (Beaucage et al. (1993) Tetrahedron 49(10):1925 and the references therein; Letsinger (1970) J. Org. Chem. 35:3800; Sprinzl et al. (1977) Eur. J. Biochem. 81:579; Letsinger et al. (1986) Nucl. Acids Res. 14: 3487; Sawai et al. (1984) Chem. Lett. 805; Letsinger et al. (1988) J. Am. Chem. Soc. 110:4470; and Pauwels et al. (1986) Chemica Scripta 26:1419), phosphorothioate (Mag et al. (1991) Nucleic Acids Res. 19:1437 and U.S. Pat. No. 5,644,048), phosphorodithioate (Briu et al. (1989) J. Am. Chem. Soc. 111:2321), O-methylphosphoroamidite linkages (Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press (1992)), and peptide nucleic acid backbones and linkages (Egholm (1992) J. Am. Chem. Soc. 114:1895; Meier et al. (1992) Chem. Int. Ed. Engl. 31:1008; Nielsen (1993) Nature 365:566; and Carlsson et al. (1996) Nature 380:207), which references are each hereby incorporated herein by reference. Other analog nucleic acids include those with positively charged backbones (Dempcy et al. (1995) Proc. Natl. Acad. Sci. USA 92:6097); non-ionic backbones (U.S. Pat. Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Angew (1991) Chem. Intl. Ed. English 30: 423; Letsinger et al. (1988) J. Am. Chem. Soc. 110:4470; Letsinger et al. (1994) Nucleoside & Nucleotide 13:1597; Chapters 2 and 3, ACS Symposium Series 580, “Carbohydrate Modifications in Antisense Research”, Ed. Y. S. Sanghvi and P. Dan Cook; Mesmaeker et al. (1994) Bioorganic & Medicinal Chem. Lett. 4: 395; Jeffs et al. (1994) J. Biomolecular NMR 34:17; Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Ed. Y. S. Sanghvi and P. Dan Cook, which references are each hereby incorporated herein by reference. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (Jenkins et al. (1995) Chem. Soc. Rev. pp 169-176, which is incorporated by reference). Several nucleic acid analogs are also described in, e.g., Rawls, C & E News Jun. 2, 1997 page 35, which is incorporated by reference. These modifications of the ribose-phosphate backbone may be done to facilitate the addition of additional moieties such as labeling moieties, or to alter the stability and half-life of such molecules in physiological environments.

In addition to naturally occurring heterocyclic bases that are typically found in nucleic acids (e.g., adenine, guanine, thymine, cytosine, and uracil), nucleic acid analogs also include those having non-naturally occurring heterocyclic or other modified bases, many of which are described, or otherwise referred to, herein. In particular, many non-naturally occurring bases are described further in, e.g., Seela et al. (1991) Helv. Chim. Acta 74:1790, Grein et al. (1994) Bioorg. Med. Chem. Lett. 4:971-976, and Seela et al. (1999) Helv. Chim. Acta 82:1640, which are each incorporated by reference. To further illustrate, certain bases used in nucleotides that act as melting temperature (Tm) modifiers are optionally included. For example, some of these include 7-deazapurines (e.g., 7-deazaguanine, 7-deazaadenine, etc.), pyrazolo[3,4-d]pyrimidines, propynyl-dN (e.g., propynyl-dU, propynyl-dC, etc.), and the like. See, e.g., U.S. Pat. No. 5,990,303, entitled “SYNTHESIS OF 7-DEAZA-2′-DEOXYGUANOSINE NUCLEOTIDES,” which issued Nov. 23, 1999 to Seela, which is incorporated by reference. Other representative heterocyclic bases include, e.g., hypoxanthine, inosine, xanthine; 8-aza derivatives of 2-aminopurine, 2,6-diaminopurine, 2-amino-6-chloropurine, hypoxanthine, inosine and xanthine; 7-deaza-8-aza derivatives of adenine, guanine, 2-aminopurine, 2,6-diaminopurine, 2-amino-6-chloropurine, hypoxanthine, inosine and xanthine; 6-azacytosine; 5-fluorocytosine; 5-chlorocytosine; 5-iodocytosine; 5-bromocytosine; 5-methylcytosine; 5-propynylcytosine; 5-bromovinyluracil; 5-fluorouracil; 5-chlorouracil; 5-iodouracil; 5-bromouracil; 5-trifluoromethyluracil; 5-methoxymethyluracil; 5-ethynyluracil; 5-propynyluracil, and the like.

“Accessibility” of a DNA region or “accessible DNA” interchangeably refers to the ability of a particular DNA region in a chromosome of a cell to be contacted and modified by a particular DNA cleaving or modifying agent. Without intending to limit the scope of the invention, it is believed that the particular chromatin structure comprising the DNA region will affect the ability of a DNA cleaving or modifying agent to cleave or modify the particular DNA region. For example, the DNA region may be wrapped around histone proteins and further may have additional nucleosomal structure that prevents, or reduces access of, the DNA cleaving or modifying agent to the DNA region of interest. Accessibility can therefore be detected as a function of the quantity of cleavage or modification. Relative accessibility between two DNA regions can be determined by comparing (e.g., generating a ratio of) cleavage or modification levels between the two regions.
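The ratio mentioned above can be sketched as follows. The sketch assumes that the number of intact copies of each region is measured before and after treatment with a cleaving agent, so that the cleaved fraction serves as the cleavage level. The function names and the copy numbers are hypothetical.

```python
def cleavage_fraction(intact_before, intact_after):
    """Fraction of copies of a region cleaved by the agent."""
    if intact_before <= 0:
        raise ValueError("intact_before must be positive")
    if not 0 <= intact_after <= intact_before:
        raise ValueError("intact_after must lie between 0 and intact_before")
    return 1.0 - intact_after / intact_before


def relative_accessibility(level_a, level_b):
    """Ratio of cleavage (or modification) levels, region A over region B."""
    if level_b == 0:
        raise ValueError("region B shows no cleavage; ratio is undefined")
    return level_a / level_b


if __name__ == "__main__":
    # Hypothetical copy numbers: region A is mostly digested, region B is not.
    region_a = cleavage_fraction(intact_before=1000, intact_after=100)
    region_b = cleavage_fraction(intact_before=1000, intact_after=700)
    print(f"region A cleaved: {region_a:.2f}")
    print(f"region B cleaved: {region_b:.2f}")
    print(f"relative accessibility A/B: {relative_accessibility(region_a, region_b):.1f}")
```

A ratio above 1 suggests that region A is the more accessible of the two under the conditions tested.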

A “heterologous sequence” or a “heterologous nucleic acid”, as used herein, is one that originates from a source foreign to the particular host cell, or, if from the same source, is modified from its original form. Thus, a heterologous expression cassette in a cell is an expression cassette that is not endogenous to the particular host cell, for example by being linked to nucleotide sequences from an expression vector rather than chromosomal DNA, being linked to a heterologous promoter, being linked to a reporter gene, etc.

A “Type II-S restriction enzyme” is used with its usual meaning in the art and refers to a restriction enzyme that recognizes a particular recognition sequence in DNA and then cleaves the DNA molecule outside of that recognition sequence. Exemplary Type II-S restriction enzymes include, but are not limited to, MnlI, FokI, and AlwI.

The terms “individual,” “patient,” and “subject” interchangeably refer to a mammal, for example, a human, a non-human primate, a domesticated mammal (e.g., a canine or a feline), an agricultural mammal (e.g., equine, bovine, ovine, porcine), or a laboratory mammal (e.g., rattus, murine, lagomorpha, hamster).

The terms “direct” or “directly” interchangeably refer to the performance of two contiguous method steps without performing any intervening method steps.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a schematic representation of the chromatin. Eukaryotic DNA can be classified into two general states: euchromatin, where the DNA is loosely packaged, accessible and transcriptionally competent, and heterochromatin, where the DNA is tightly packaged, inaccessible and transcriptionally silent. Epigenetics controls the transition between these two states.

FIG. 2 illustrates a schematic representation of a procedure for the isolation of the subpopulation of gDNA that corresponds to inaccessible chromatin. Cells are treated with a buffer that permeabilizes the cell and contains a nuclease. The nuclease diffuses into the cell, enters the nucleus and digests accessible chromatin, but inaccessible chromatin (represented as a thick line towards the bottom of the Figure) is not digested. After gDNA purification, DNA that was originally in an inaccessible chromatin configuration will be enriched relative to DNA that was originally in an accessible chromatin configuration.

FIG. 3 illustrates a procedure for the isolation of the subpopulation of gDNA that corresponds to accessible chromatin. Permeabilized cells or isolated nuclei are treated with an agent that modifies accessible chromatin, but does not modify inaccessible chromatin. Total gDNA can be isolated and sheared into fragments of 50 bp to 1 kb or so in size. The gDNA fragments containing sites of modification are then purified by an appropriate method. The purified gDNA represents the subpopulation of gDNA that was originally in an accessible chromatin configuration.

DETAILED DESCRIPTION

1. Introduction

The present invention provides processes by which a difference in theDNA methylation state of a genomic DNA region can be identified in asubpopulation of genomic DNA (gDNA). The DNA methylation status in thesubpopulation can be determined relative to the total gDNA population orrelative to the same enriched subpopulation in a parallel sample, e.g.,of a control sample, e.g., that is of a differentphysiological/pathophysiological state or that has been treated oruntreated with an agent (e.g., a pharmacological agent or drug). Therelative DNA methylation information of the gDNA subpopulation canprovide insight on, e.g., epigenetic changes in a subpopulation of cellsand/or epigenetic differences between two alleles.

The identification of epigenetic changes in a subpopulation of cells can have significant biomedical relevance. For example, analysis of biopsy samples may identify a subpopulation of gDNA that has an aberrant DNA methylation profile. Such information may indicate, or be correlated with, the presence of a malignancy or pre-neoplastic lesion in a background of healthy cells.

The identification of epigenetic differences between the two alleles can also have biomedical relevance. Such information can determine whether a specific genomic region is properly imprinted (one allele is active and the other is epigenetically silenced); imprinting defects are associated with numerous genetic disorders. This information can also detect loss of imprinting, a phenomenon associated with the development and progression of several cancers.

The present invention identifies differences in DNA methylation employing two general steps: (1) enrichment of a subpopulation of gDNA and (2) analysis and comparison of DNA methylation in the subpopulation of gDNA relative to a reference population of gDNA, e.g., to the DNA methylation of total gDNA or a control sample of the same subpopulation of gDNA. Enrichment of a subpopulation of gDNA prior to analyzing the extent, quality or patterning of methylation provides superior detection sensitivity in comparison to established methodologies. For example, the present methods can detect cancer in a biopsy sample which contains a small portion (e.g., 1% or less) of cancerous cells. A tumor biopsy can be effectively screened for a DNA methylation biomarker that is found in cancerous cells but not in normal tissue, e.g., the GSTP1 promoter.

The present methods offer advantages over the established methodologies because the accessible or inaccessible subpopulation of genomic DNA may be enriched for DNA associated with cancerous cells. When performing the present methods, it is possible to observe over 99% nuclease digestion of accessible DNA regions. Thus, in a biopsy sample that is 1% cancerous, a DNA methylation biomarker that is associated with inaccessible DNA will be enriched 50-fold after nuclease digestion of accessible DNA.
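The 50-fold figure can be reproduced with a short calculation. The following Python sketch is illustrative only and rests on simplifying assumptions that are not stated in the text above: every cell contributes one copy of the locus, the locus is fully protected from the nuclease in cancerous cells, and it is accessible in all normal cells. The function names are hypothetical.

```python
def enriched_fraction(cancer_fraction, digestion_efficiency):
    """Fraction of surviving locus copies that derive from cancerous cells.

    Assumptions (illustrative): the locus is inaccessible, and therefore
    fully protected, in cancerous cells; it is accessible in normal cells
    and is digested there with the given efficiency.
    """
    if not 0.0 < cancer_fraction <= 1.0:
        raise ValueError("cancer_fraction must be in (0, 1]")
    if not 0.0 <= digestion_efficiency <= 1.0:
        raise ValueError("digestion_efficiency must be in [0, 1]")
    surviving_cancer = cancer_fraction
    surviving_normal = (1.0 - cancer_fraction) * (1.0 - digestion_efficiency)
    return surviving_cancer / (surviving_cancer + surviving_normal)


def fold_enrichment(cancer_fraction, digestion_efficiency):
    """Ratio of the post-digestion fraction to the starting fraction."""
    return enriched_fraction(cancer_fraction, digestion_efficiency) / cancer_fraction


# A 1% cancerous sample with 99% digestion of accessible DNA
print(round(fold_enrichment(0.01, 0.99), 1))
```

Under these assumptions the biomarker rises from 1% to roughly half of the surviving copies, which corresponds to the approximately 50-fold enrichment described above.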

2. Obtaining, Processing and Dividing a Biological Sample

The methods evaluate genomic DNA in a biological sample. In various embodiments, the biological sample can comprise body fluids, tissues, cells, isolated nuclei or isolated genomic DNA. In some embodiments, the methods further comprise the step of obtaining the biological sample or samples.

In some embodiments, the biological sample comprises cells. A variety of eukaryotic cells can be used in the present invention. In some embodiments, the cells are animal cells, including but not limited to, human, or non-human, mammalian cells. Non-human mammalian cells include but are not limited to, primate cells, mouse cells, rat cells, porcine cells, and bovine cells. In some embodiments, the cells are plant cells. Cells can be, for example, cultured primary cells, immortalized culture cells or can be from a biopsy or tissue sample, optionally cultured and stimulated to divide before being assayed. Cultured cells can be in suspension or adherent. Cells can be from animal tissues, biopsies, etc. For example, the cells can be from a tumor biopsy, a hair bulb, a cheek swab or another solid tissue sample. In some embodiments, the biological sample is a fluid sample. For example, the biological sample can be from blood, serum, plasma, semen, urine, saliva, amniotic fluid, or a tissue/cell culture suspension.

In some embodiments, the biological sample is from a tissue suspected of being cancerous. In various embodiments, the biological sample is from a biopsy, for example, solid tissue, for example, an epithelial tissue. Exemplary epithelial tissues include without limitation thyroid, adrenal gland, bladder, uterus, breast, prostate, testicular, liver, lung, cervical, ovary, skin, gastrointestinal, colorectal, kidney, pancreas, stomach, brain, esophagus tissue, etc. In the case of suspected hematological cancers, the biological sample can be a blood sample. Cells of interest or suspected of being cancerous in the blood sample can be isolated or enriched according to known techniques before isolating gDNA and enriching for a gDNA subpopulation. One well known technique uses antibodies conjugated to magnetic beads or another selectable label, wherein the antibodies bind to a cell surface marker of interest. For example, B cells, T cells, or macrophages, or another blood cell subset suspected of being cancerous can be isolated or enriched prior to isolating gDNA, e.g., using antibodies that bind to a surface antigen commonly expressed by the cell population or antibodies that bind to a known cancer-associated cell surface marker. Those of skill recognize that cancerous cells in the blood can be isolated or enriched using an antibody-based isolation or enrichment scheme. It is also possible that tissue or solid samples may be disrupted and a subpopulation isolated or enriched by antibody selection. Assessing the DNA methylation status of accessible/inaccessible DNA in a selected cell population is contemplated by the present invention.

The methods generally comprise comparing a divided biological sample, e.g., divided into a first portion and a second portion, wherein a subpopulation of genomic DNA is enriched in one of the portions. The second portion can retain total genomic DNA. Alternatively, the methods can comprise comparing two biological samples, both enriched for the same subpopulation of genomic DNA. For example, the first biological sample can be from normal tissue and the second biological sample can be from tissue suspected of being cancerous or pre-cancerous. Alternatively, the first biological sample can be treated with an agent, e.g., a chemical or pharmaceutical agent, and the second biological sample can be an untreated control sample. Preferably, in methods comparing two biological samples, both enriched for the same subpopulation of genomic DNA, the biological samples are from the same tissue type.

In various embodiments, for example where the biological sample comprises cells or tissues, the methods further comprise the step of treating the biological sample to allow the modifying agent to access the chromatin. Any treatment known in the art finds use. Illustrative treatments to apply to the cells or tissues to facilitate accessibility of chromatin to a modifying agent include without limitation, e.g., expression of the modifying agent in living cells (e.g., DamID); permeabilization of cells to allow access of the modifying agent to chromatin; isolation of cell nuclei followed by diffusion of the modifying agent into the nuclei such that it can modify chromatin; and disruption of cells to release chromatin followed by, or simultaneous with, treatment with the modifying agent.

Chromatin can be exposed to a modifying agent. The modifying agent generally preferentially modifies accessible chromatin in comparison to its ability to modify inaccessible chromatin. Any agents that modify genomic DNA find use, and those that introduce non-endogenous or non-naturally occurring modifications are more easily detected. Illustrative modifying agents include nucleases that digest DNA in accessible chromatin such that only the DNA that was in inaccessible chromatin remains. The nuclease can be any agent that digests or degrades DNA and includes chemicals, restriction enzymes and nucleases (e.g., DNase I). Additional gDNA modifying agents that find use include those that place a “mark” on the DNA in accessible chromatin. For example, the agent can be a DNA methyltransferase where the mark will be a methyl group. To analyze higher eukaryotes, a DNA methyltransferase that modifies any residue that is not cytosine (e.g., adenine) can be particularly useful. Methylated adenine is not a natural base in eukaryotic cells and could thus serve as a distinguishing feature. The DAM methyltransferase is an example of an adenine methyltransferase.

3. Enriching for a Subpopulation of Genomic DNA (gDNA)

The methods evaluate the methylation status or extent or type of methylation in a subpopulation of genomic DNA. The methylation status or extent or type of methylation in the subpopulation of genomic DNA is compared to a reference, for example, total genomic DNA or genomic DNA enriched for the same subpopulation, e.g., that has been exposed to an agent or from neighboring tissue (known to be cancerous, known to be pre-cancerous, known to be non-cancerous, etc.).

Generally, the genomic DNA can be enriched for accessible or inaccessible subpopulations. In some embodiments, the genomic DNA is enriched for regions in duplex with RNA. Various embodiments of the methods further include the step of enriching for histones bearing modifications specifically recognized by antibodies. Illustrative histone modifications of interest for enrichment include, e.g., Histone 3 lysine 4, mono, di and/or tri methylated; Histone 3 lysine 9, mono, di and/or tri methylated; Histone 3 lysine 9, acetylated; Histone 3 lysine 27, mono, di and/or tri methylated; Histone 3 lysine 27, acetylated; Histone 3 lysine 36, mono, di and/or tri methylated; Histone 3 lysine 79, mono, di and/or tri methylated; Histone 4 lysine 20, mono, di and/or tri methylated; acetylated Histone H3; and acetylated Histone H4.

a. Enriching for Inaccessible gDNA

Enriching for inaccessible DNA can be accomplished by any method in the art. One method, described in U.S. application Ser. No. 12/618,076, involves simultaneous permeabilization of a cell and contacting the cell with a DNA cleaving agent under conditions such that accessible genomic DNA is cleaved, thereby enriching for inaccessible genomic DNA.

In various embodiments, inaccessible chromatin is isolated when the modifying agent is a nuclease. The nuclease treatment is such that all or most of the DNA that was in accessible chromatin is degraded, leaving an enriched subpopulation of DNA that was in inaccessible chromatin. gDNA isolated from a portion of the biological sample that is not treated with the nuclease represents total DNA.

i. Permeabilizing and Disrupting Cells

Cell membranes can be permeabilized or disrupted in any way known in the art. As explained herein, the present methods involve contacting the genomic DNA with a modifying agent prior to isolation of the DNA, and thus the methods of permeabilizing or disrupting the cell membrane are those that do not disrupt the structure of the genomic DNA of the cell such that nucleosomal or chromatin structure is destroyed or perturbed.

In some embodiments, the cell membrane is contacted with an agent that permeabilizes or disrupts the cell membrane. Lysolipids are an exemplary class of agents that permeabilize cell membranes. Exemplary lysolipids include, but are not limited to, lysophosphatidylcholine (also known in the art as lysolecithin) or monopalmitoylphosphatidylcholine. A variety of lysolipids are also described in, e.g., WO 2003/052095.

Non-ionic detergents are an exemplary class of agents that disrupt cell membranes. Exemplary non-ionic detergents include, but are not limited to, NP40, Tween 20 and Triton X-100.

One advantage of the present invention is the simultaneous delivery of the permeabilization agent and the DNA cleaving or DNA modifying agent. Thus, in some embodiments, a buffer comprising both agents is contacted to the cell. The buffer should be adapted for maintaining activity of both agents while maintaining the structure of the cellular chromatin.

Alternatively, electroporation or biolistic methods can be used to permeabilize a cell membrane such that a DNA modifying agent is introduced into the cell and can thus contact the genomic DNA. A wide variety of electroporation methods are well known and can be adapted for delivery of DNA modifying agents as described herein. Exemplary electroporation methods include, but are not limited to, those described in WO/2000/062855. Biolistic methods include but are not limited to those described in U.S. Pat. No. 5,179,022.

ii. Contacting with a DNA Cleaving Agent

Following permeabilization, or simultaneously with permeabilization (e.g., during electroporation or during incubation with permeabilizing agent), a DNA cleaving agent is introduced such that the agent contacts the genomic DNA, thereby introducing modifications into the DNA. A wide variety of DNA cleaving agents can be used according to the present invention.

In some embodiments, the DNA cleaving agents are contacted to the permeabilized cells following removal of the permeabilizing agent, optionally with a change of the buffer. Alternatively, in some preferred embodiments, the DNA cleaving agent is contacted to the genomic DNA without one or more intervening steps (e.g., without an exchange of buffers, washing of the cells, etc.). As noted above, this latter approach can be convenient for reducing the amount of labor and time necessary and also removes a potential source of error and contamination in the assay.

The quantity of DNA cleaving agent used, as well as the time of the reaction with the DNA cleaving agent, will depend on the agent used. Those of skill in the art will appreciate how to adjust conditions depending on the agent used. Generally, the conditions of the DNA modifying step are adjusted such that a “complete” digestion is not achieved. Thus, for example, in some embodiments, the conditions of the modifying step are set such that modification of the positive control (i.e., the control region that is accessible and in which modification occurs) takes place at a high level but less than 100%, e.g., between 80-95%, 80-99%, 85-95%, 90-98%, etc.

Restriction Enzymes

In some embodiments, the DNA cleaving agent is a restriction enzyme.

Thus, in these embodiments, the modification introduced into the genomic DNA is a sequence-specific single-stranded (e.g., a nick) or double-stranded cleavage event. A wide variety of restriction enzymes are known and can be used in the present invention.

Any type of restriction enzyme can be used. Type I enzymes cut DNA at random far from their recognition sequences. Type II enzymes cut DNA at defined positions close to or within their recognition sequences. Some Type II enzymes cleave DNA within their recognition sequences. Type II-S enzymes cleave outside of their recognition sequence to one side. The third major kind of type II enzyme, more properly referred to as “type IV,” cleaves outside of its recognition sequence. For example, those that recognize continuous sequences (e.g., AcuI: CTGAAG) cleave on just one side; those that recognize discontinuous sequences (e.g., BcgI: CGANNNNNNTGC) cleave on both sides, releasing a small fragment containing the recognition sequence. Type III enzymes cleave outside of their recognition sequences and require two such sequences in opposite orientations within the same DNA molecule to accomplish cleavage.
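As an illustration of how the two recognition sequences given above could be located in a DNA region of interest, the following Python sketch scans both strands for AcuI (CTGAAG) and BcgI (CGANNNNNNTGC) sites. It reports recognition-site positions only and does not model where each enzyme cuts; the function names and the test sequences are hypothetical.

```python
import re

# Recognition sequences as given in the text; N matches any base.
SITES = {"AcuI": "CTGAAG", "BcgI": "CGANNNNNNTGC"}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T/N sequence."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def site_regex(site):
    """Translate a recognition sequence into a regular expression."""
    return "".join(IUPAC[base] for base in site)


def find_sites(dna, site):
    """Return sorted (start, strand) tuples for every occurrence of `site`.

    Positions are 0-based on the top strand. A lookahead is used so that
    overlapping occurrences are also reported.
    """
    dna = dna.upper()
    hits = set()
    for strand, motif in (("+", site), ("-", reverse_complement(site))):
        pattern = re.compile("(?=(" + site_regex(motif) + "))")
        for match in pattern.finditer(dna):
            hits.add((match.start(), strand))
    return sorted(hits)


print(find_sites("TTCTGAAGTT", SITES["AcuI"]))
print(find_sites("CGAAAAAAATGC", SITES["BcgI"]))
```

A map of this kind could help in choosing an enzyme whose sites fall within or near the DNA region being assayed.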

The methods of the invention can be adapted for use with any type of restriction enzyme or other DNA cleaving enzyme. In some embodiments, the enzyme is one or more that cleaves relatively close to (e.g., within 5, 10, or 20 base pairs of) the recognition sequence. Such enzymes can be of particular use in assaying chromatin structure as the span of DNA that must be accessible to achieve cutting is larger than the recognition sequence itself and thus may involve a wider span of DNA that is not in a “tight” chromatin structure. Sequence-specific restriction enzymes can provide improved quantitative results in part because controls based on the same DNA region can be designed as described herein (e.g., in the Examples). Thus, the number of total and digested copies can be more accurately determined compared to, e.g., digestion with sequence non-specific endonucleases (“DNases”). Illustrative enzymes that cut outside their recognition sequence include, e.g., Type II-S, Type III, and Type IV enzymes. Type II-S restriction enzymes include, but are not limited to, MnlI, FokI and AlwI.

In some embodiments, more than one (e.g., two, three, four, etc.) restriction enzymes are used. Combinations of enzymes can involve combinations of enzymes all from one type or can be mixes of different types.

Intact or cut DNA can subsequently be separately detected and quantified and the number of intact and/or cut copies of a DNA region can be determined as described herein.

In some embodiments, the permeabilizing or membrane disrupting agent is added prior to the restriction enzyme. In some embodiments, the restriction enzyme and permeabilizing or disrupting agent are added simultaneously (e.g., in or with appropriate buffers). Even if both agents are not initially contacted to a cell at the same moment, one can still achieve simultaneous permeabilization and contact with a DNA cleaving agent because permeabilization can be an ongoing process. Thus, for example, addition of a permeabilizing agent followed soon after (before permeabilization is substantially complete) with a DNA modifying agent can be considered “simultaneously” permeabilizing and contacting the cell with the DNA modifying agent. “Simultaneous” means no intervening manipulations occur (including but not limited to change of buffer, centrifugation, etc.) between addition of the permeabilizing agent and the modifying agent.

In some embodiments, 0.5% lysolecithin (w/v), 50 mM NaCl, 10 mM Tris-HCl pH 7.4, 10 mM MgCl₂, 1 mM DTT, 100 μg/ml BSA and 0-500 units/ml MnlI (or other restriction enzyme) are used. In some embodiments, 0.25% lysolecithin (w/v), 50 mM NaCl, 10 mM Tris-HCl pH 7.4, 10 mM MgCl₂, 1 mM DTT, 100 μg/ml BSA and 0-500 units/ml MnlI (or other restriction enzyme) are used. In some embodiments, 0.75% lysolecithin (w/v), 50 mM NaCl, 10 mM Tris-HCl pH 7.4, 10 mM MgCl₂, 1 mM DTT, 100 μg/ml BSA and 0-500 units/ml MnlI (or other restriction enzyme) are used. In some embodiments, 1% lysolecithin (w/v), 50 mM NaCl, 10 mM Tris-HCl pH 7.4, 10 mM MgCl₂, 1 mM DTT, 100 μg/ml BSA and 0-800 units/ml MnlI (or other restriction enzyme) are used.
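For bench preparation, the final concentrations listed above can be converted to stock volumes with the usual dilution relation (stock concentration × stock volume = final concentration × final volume). The Python sketch below does this for the 0.5% lysolecithin formulation. The stock concentrations are assumptions chosen purely for illustration and are not taken from this description; the restriction enzyme is omitted because it is specified as a range of units/ml.

```python
def stock_volumes(final_volume_ml, targets, stocks):
    """Return the volume (ml) of each stock needed, plus water to volume.

    `targets` and `stocks` map component name -> concentration; the two
    values for a given component must be expressed in the same unit.
    """
    volumes = {
        name: final_volume_ml * targets[name] / stocks[name] for name in targets
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested volume")
    volumes["water"] = water
    return volumes


# Final concentrations from the 0.5% lysolecithin embodiment.
targets = {
    "lysolecithin_pct": 0.5,
    "NaCl_mM": 50,
    "Tris_HCl_mM": 10,
    "MgCl2_mM": 10,
    "DTT_mM": 1,
    "BSA_ug_per_ml": 100,
}
# Hypothetical stock concentrations (assumptions, same units as targets).
stocks = {
    "lysolecithin_pct": 5,
    "NaCl_mM": 5000,
    "Tris_HCl_mM": 1000,
    "MgCl2_mM": 1000,
    "DTT_mM": 1000,
    "BSA_ug_per_ml": 10000,
}

recipe = stock_volumes(10.0, targets, stocks)
for name, ml in recipe.items():
    print(f"{name}: {ml:.2f} ml")
```

The same function could be reused for the other formulations by changing only the lysolecithin entry in `targets`.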

Following permeabilization and digestion, the digestion optionally is stopped and the cells are lysed, optionally by simultaneous addition of a lysis/stop buffer and/or increased temperature. Exemplary lysis/stop buffers can include sufficient chelator and detergent to stop the reaction and to lyse the cells. For example, in some embodiments, the lysis/stop buffer comprises 100 mM Tris-HCl pH 8, 100 mM NaCl, 100 mM EDTA, 5% SDS (w/v) and 3 mg/ml proteinase K. In some embodiments, the lysis/stop buffer comprises 100 mM Tris-HCl pH 8, 100 mM NaCl, 100 mM EDTA, 1% SDS (w/v) and 3 mg/ml proteinase K. In some embodiments, the lysis/stop buffer comprises 200 mM Tris-HCl pH 8, 100 mM NaCl, 500 mM EDTA, 5% SDS (w/v) and 5 mg/ml proteinase K.

DNases

In some embodiments, an enzyme that cuts or nicks DNA in a sequence non-specific manner is used as a DNA modifying agent. Thus, in some embodiments, the DNA modifying agent is a sequence non-specific endonuclease (also referred to herein as a “DNase”).

Any sequence non-specific endonuclease (e.g., any of DNase I, II, III, IV, V, VI, VII) can be used according to the present invention. For example, any DNase, including but not limited to, DNase I can be used. DNases used can include naturally occurring DNases as well as modified DNases. An example of a modified DNase is TURBO DNase (Ambion), which includes mutations that allow for “hyperactivity” and salt tolerance. Exemplary DNases include, but are not limited to, bovine pancreatic DNase I (available from, e.g., New England Biolabs). Also of use are double strand DNases (dsDNases). One example of a dsDNase is the shrimp dsDNase, e.g., offered by Marine Biochemicals (marinebiochem.com).

Intact DNA can subsequently be separately detected and quantified and the number of intact and/or cut copies of a DNA region can be determined.

In some embodiments, the permeabilizing or membrane disrupting agent is added prior to the DNase. In some embodiments, the DNase and permeabilizing or disrupting agent are added simultaneously (e.g., with appropriate buffers). In some embodiments, the permeabilization/digestion buffer comprises 0.25% lysolecithin (w/v), 10 mM Tris-HCl pH 7.4, 2.5 mM MgCl₂, 0.5 mM CaCl₂ and 0-200 units/ml DNase I. In some embodiments, the permeabilization/digestion buffer comprises 0.5% lysolecithin (w/v), 10 mM Tris-HCl pH 7.4, 2.5 mM MgCl₂, 0.5 mM CaCl₂ and 0-200 units/ml DNase I. In some embodiments, the permeabilization/digestion buffer comprises 0.75% lysolecithin (w/v), 10 mM Tris-HCl pH 7.4, 2.5 mM MgCl₂, 0.5 mM CaCl₂ and 0-500 units/ml DNase I.

In some embodiments, the permeabilization/digestion buffer comprises 0.25% lysolecithin (w/v), 10 mM Tris-HCl pH 7.4, 2.5 mM MgCl₂, 0.5 mM CaCl₂ and 0-500 units/ml DNase I. Digestion can be stopped and the cells lysed, for example, as described above for restriction enzymes.

Use of a DNase or other general DNA cleaving agent can be enhanced by monitoring extent of cleavage between at least two different DNA regions, one being the target, and the other being a DNA region that is generally always accessible or is generally always inaccessible in any of the test conditions. Examples of such genes are discussed elsewhere herein and are known or can be identified. For example, DNA regions encompassing “housekeeping” genes are generally always accessible. The relative amount of remaining target compared to the control can then be used to determine relative chromatin structure at the target DNA region.
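One way to express the comparison described above is as a ratio of the fraction of copies remaining at the target to the fraction remaining at the control region. The Python sketch below is a minimal illustration under the assumption that copy numbers for each region are available from a digested portion and an undigested portion of the same sample; the function names and the example copy numbers are hypothetical.

```python
def fraction_remaining(copies_after_digestion, copies_undigested):
    """Fraction of copies of a DNA region that survive nuclease treatment."""
    if copies_undigested <= 0:
        raise ValueError("copies_undigested must be positive")
    return copies_after_digestion / copies_undigested


def relative_protection(target, control):
    """Remaining target fraction divided by remaining control fraction.

    `target` and `control` are (copies_after_digestion, copies_undigested)
    tuples. With an always-accessible control, values well above 1 suggest
    that the target is less accessible than the control.
    """
    control_fraction = fraction_remaining(*control)
    if control_fraction == 0:
        raise ValueError("control region was completely digested")
    return fraction_remaining(*target) / control_fraction


# Hypothetical copy numbers: target mostly protected, control mostly digested
print(round(relative_protection(target=(800, 1000), control=(50, 1000)), 1))
```

If an always-inaccessible control were used instead, a ratio close to 1 would suggest that the target is similarly protected.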

Size Selection

Inaccessible gDNA can also be enriched by size selection. gDNA fragments that were exposed to a cleaving agent can be fractionated according to size using any method in the art, including, e.g., gel filtration, electrophoresis, size exclusion, fractionation on a sucrose gradient or purification on a commercially available device such as the Pippin Prep (Sage Science, on the internet at sagescience.com) or the LabChip XT (Caliper Life Sciences, on the internet at caliperls.com). Accessible chromatin regions will be relatively smaller in size; inaccessible chromatin regions will be relatively larger. The relatively larger gDNA fragments representative of inaccessible gDNA or the relatively smaller gDNA fragments representative of accessible gDNA are used for subsequent DNA methylation determination. For example, in some embodiments, the DNA is selected for fragments larger than 100, 500, or 1000 base pairs or other sizes, including but not limited to, 500-1000, 500-2000, 500-3000 or 500-8000 base pairs.
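The size windows mentioned above can be pictured with a simple in silico filter. The Python sketch below is only an illustration of the thresholds; it assumes that fragment lengths are available as a list of integers (for example, from an instrument report), which is not something this description specifies, and it does not model any physical fractionation method.

```python
def select_by_size(fragment_lengths_bp, min_bp=None, max_bp=None):
    """Keep fragment lengths within [min_bp, max_bp]; None means unbounded."""
    return [
        length
        for length in fragment_lengths_bp
        if (min_bp is None or length >= min_bp)
        and (max_bp is None or length <= max_bp)
    ]


# Hypothetical fragment lengths in base pairs
fragments = [80, 150, 450, 600, 950, 1800, 2500, 7000, 12000]

# Larger fragments, taken as representative of inaccessible gDNA
print(select_by_size(fragments, min_bp=500, max_bp=8000))
# Smaller fragments, taken as representative of accessible gDNA
print(select_by_size(fragments, max_bp=499))
```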

b. Enriching for Accessible gDNA

Modifying agents that preferentially introduce non-naturally occurring modifications into the accessible gDNA find use for the enrichment of accessible gDNA. Following treatment of the biological sample with the modifying agent, the genomic DNA is isolated and can optionally be sheared into fragments. The subpopulation of DNA that was in accessible chromatin is then purified using an affinity agent that recognizes the “mark.” DNA isolated from a portion of the biological sample that is not treated with modifying agent represents total DNA. Applicable methods for enriching for accessible gDNA are known in the art and find use.

DNA Modifying Agents

In methods for enriching for accessible gDNA, a DNA modifying agent is introduced into a nucleus having genomic DNA under such conditions that the DNA modifying agent modifies the genomic DNA in the nucleus such that the modification is not naturally occurring. A wide variety of DNA modifying agents can be used according to the present invention, including but not limited to enzymes, proteins, and chemicals.

In some embodiments, the DNA modifying agent is introduced into an isolated nucleus. In some embodiments, the DNA modifying agent is introduced into a nucleus in a cell following permeabilization, or simultaneously with permeabilization (e.g., during electroporation or during incubation with permeabilizing agent).

In some embodiments, the DNA modifying agents are contacted to permeabilized cells following removal of the permeabilizing agent, optionally with a change of the buffer. Alternatively, in some preferred embodiments, the DNA modifying agent is contacted to the genomic DNA without one or more intervening steps (e.g., without an exchange of buffers, washing of the cells, etc.). As noted above, this latter approach can be convenient for reducing the amount of labor and time necessary and also removes a potential source of error and contamination in the assay.

The quantity of DNA modifying agent used, as well as the time of the reaction with the DNA modifying agent, will depend on the agent used. Those of skill in the art will appreciate how to adjust conditions depending on the agent used. Generally, the conditions of the DNA modifying step are adjusted such that a “complete” modification is not achieved. Thus, for example, in some embodiments, the conditions of the modifying step are set such that, for the positive control (i.e., the control region that is accessible and in which modification occurs), the proportion of copies of that positive control DNA region that are modified is at least about 10%, at least about 15%, 20%, 25%, 30%, 40%, or more.

Methyltransferases

In some embodiments of the invention, the DNA modifying agent generates a covalent modification to the DNA. In some embodiments, a DNA methyltransferase is used to enrich for accessible genomic DNA. A variety of methyltransferases are known in the art and find use. DNA methyltransferases covalently modify specific bases in DNA by methylating them. In mammalian genomes, methylation of cytosine at the 5-position is a common epigenetic mark that is associated with gene silencing. Other types of DNA modification, such as methylation of adenine at the 6-position (6-mA), occur in bacteria and lower eukaryotes, but are not found in mammals including humans. A DNA methyltransferase that catalyzes methylation of adenine at the 6-position could thus be used to mark mammalian chromatin with a distinguishing feature.

In some embodiments, the methyltransferase used adds a methyl moiety to adenosine in DNA. Examples of such methyltransferases include, but are not limited to, E. coli DAM methyltransferase, M.TaqI, M.EcoRV, M.FokI, and M.EcoRI. Because adenosine generally is not methylated in eukaryotic cells, the presence of a methylated adenosine in a particular DNA region indicates that a DAM methyltransferase, M.TaqI, M.EcoRV, M.FokI, or M.EcoRI (or other methyltransferase with similar activity) was able to access the DNA region.

In some embodiments, the methyltransferase methylates cytosines in GC sequences. Examples of such methyltransferases include, but are not limited to, M.CviPI. See, e.g., Xu et al., Nucl. Acids Res. 26(17): 3961-3966 (1998). Because GC sequences generally are not methylated in eukaryotic cells, the presence of a methylated GC sequence in a particular DNA region indicates that the DNA modifying agent (i.e., a methyltransferase that methylates cytosines in GC sequences) was able to access the DNA region.

In some embodiments, the methyltransferase methylates cytosines in CG (also known as “CpG”) sequences. Examples of such methyltransferases include, but are not limited to, M.SssI. Use of such methyltransferases will generally be limited to use for those DNA regions that are not typically methylated. This is because CG sequences are endogenously methylated in eukaryotic cells and thus it is not generally possible to assume that a CG sequence is methylated by the modifying agent rather than an endogenous methyltransferase except in such DNA regions where methylation is rare.
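The distinction between GC and CG contexts can be made concrete by classifying each cytosine in a sequence. The Python sketch below labels top-strand cytosines as GC-only, CG-only, both, or neither. Treating cytosines in a GCG context as ambiguous is an inference from the two paragraphs above rather than something stated there, and the function name and labels are hypothetical.

```python
def classify_cytosines(dna):
    """Map each top-strand cytosine position (0-based) to its context.

    "GpC": preceded by G only (informative for a GC methyltransferase)
    "CpG": followed by G only (may carry endogenous methylation)
    "GCG": both contexts, so the origin of a methyl mark is ambiguous
    "other": neither context
    """
    dna = dna.upper()
    contexts = {}
    for i, base in enumerate(dna):
        if base != "C":
            continue
        in_gc = i > 0 and dna[i - 1] == "G"
        in_cg = i + 1 < len(dna) and dna[i + 1] == "G"
        if in_gc and in_cg:
            contexts[i] = "GCG"
        elif in_gc:
            contexts[i] = "GpC"
        elif in_cg:
            contexts[i] = "CpG"
        else:
            contexts[i] = "other"
    return contexts


print(classify_cytosines("AGCTACGTGCGA"))
```

Only the top strand is examined here; the complementary strand could be handled by passing the reverse complement of the sequence.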

Other suitable methyltransferases that are known in the art include, for example, methyltransferases that methylate cytosine at the N4 position (e.g., M.BamHI and M.PvuII) and methyltransferases that methylate cytosine at the C5 position (e.g., M.HhaI). Alternatively, mutated or genetically engineered methyltransferases that exhibit altered DNA target-site specificity or altered DNA modification specificity can be used.

To isolate the subpopulation of gDNA that was in accessible chromatin, a cell sample can be split into two portions. The first portion can be treated with a methyltransferase (e.g., an adenine methyltransferase) in situ to mark accessible chromatin; the second portion is not treated with the methyltransferase. gDNA can then be isolated from both cell portions and the DNA can be sheared to a constant size by any appropriate means. The gDNA from the first portion can then be subjected to affinity purification with an antibody that recognizes 6-mA. The purified gDNA represents the subpopulation of gDNA that was in an accessible chromatin structure. The second portion of gDNA is analyzed as is and represents total gDNA. See FIG. 3.

Chemicals

In some embodiments, the DNA modifying agent comprises a DNA modifying chemical. As most DNA modifying chemicals are relatively small compared to chromatin, use of DNA modifying chemicals without a fusion partner may not be effective in some circumstances as there will be little if any difference in the extent of accessibility of different DNA regions. Therefore, in some embodiments, the DNA modifying agent comprises a molecule having steric hindrance linked to a DNA modifying chemical. The molecule having steric hindrance can be any protein or other molecule that results in differential accessibility of the DNA modifying agent depending on chromatin structure. This can be tested, for example, by comparing results to those using a methyltransferase.

In some embodiments, the molecule having steric hindrance will be at least 5, 7, 10, or 15 kD in size. Those of skill in the art will likely find it convenient to use a polypeptide as the molecule with steric hindrance. Any polypeptide can be used that does not significantly interfere with the DNA modifying agent's ability to modify DNA. In some embodiments, the polypeptide is a double-stranded sequence-non-specific nucleic acid binding domain as discussed in further detail below.

The DNA modifying chemicals of the present invention can be linked directly to the molecule having steric hindrance or via a linker. A variety of homo- and hetero-bifunctional linkers are known and can be used for this purpose.

Exemplary DNA modifying chemicals include but are not limited to hydrazine (and derivatives thereof, e.g., as described in Mathison et al., Toxicology and Applied Pharmacology 127(1):91-98 (1994)) and dimethyl sulfate. In some embodiments, hydrazine introduces a methyl group to guanine in DNA or otherwise damages DNA. In some embodiments, dimethyl sulfate methylates guanine or results in the base-specific cleavage of guanine in DNA by rupturing the imidazole rings present in guanine.

DNA Binding Domains to Improve DNA Modifying Agents

In some embodiments, the DNA modifying agents used for enrichment ofaccessible gDNA are fused or otherwise linked to a double-strandedsequence-non-specific nucleic acid binding domain (e.g., a DNA bindingdomain). In cases where the DNA modifying agent is a polypeptide, thedouble-stranded sequence-non-specific nucleic acid binding domain can besynthesized, for example, as a protein fusion with the DNA modifyingagent via recombinant DNA technology. A double-strandedsequence-non-specific nucleic acid binding domain is a protein ordefined region of a protein that binds to double-stranded nucleic acidin a sequence-independent manner, i.e., binding does not exhibit a grosspreference for a particular sequence. In some embodiments,double-stranded nucleic acid binding proteins exhibit a 10-fold orhigher affinity for double-stranded versus single-stranded nucleicacids. The double-stranded nucleic acid binding proteins in someembodiments of the invention are thermostable. Examples of such proteinsinclude, but are not limited to, the Archaeal small basic DNA bindingproteins Sac7d and Sso7d (see, e.g., Choli et al., Biochimica etBiophysica Acta 950: 193-203, 1988; Baumann et al., Structural Biol. 1:808-819, 1994; and Gao et al, Nature Struc. Biol. 5:782-786, 1998),Archael HMf-like proteins (see, e.g., Starich et al., J. Molec. Biol.255: 187-203, 1996; Sandman et al., Gene 150:207-208, 1994), and PCNAhomologs (see, e.g., Cann et al., J. Bacteriology 181:6591-6599, 1999;Shamoo and Steitz, Ce11:99, 155-1 66, 1999; De Felice et al., J. Molec.Biol. 291,47-57, 1999; and Zhang et al., Biochemistry 34:10703-10712,1995). See also European Patent 1283875B1 for addition informationregarding DNA binding domains.

Sso7d and Sac7d

Sso7d and Sac7d are small (about 7,000 Da MW), basic chromosomal proteins from the hyperthermophilic archaebacteria Sulfolobus solfataricus and S. acidocaldarius, respectively. These proteins are lysine-rich and have high thermal, acid and chemical stability. They bind DNA in a sequence-independent manner and, when bound, increase the Tm of DNA by up to 40° C. under some conditions (McAfee et al., Biochemistry 34:10063-10077, 1995). These proteins and their homologs are typically believed to be involved in stabilizing genomic DNA at elevated temperatures.

HMf-Like Proteins

The HMf-like proteins are archaeal histones that share homology both in amino acid sequences and in structure with eukaryotic H4 histones, which are thought to interact directly with DNA. The HMf family of proteins form stable dimers in solution, and several HMf homologs have been identified from thermostable species (e.g., Methanothermus fervidus and Pyrococcus strain GB-3a). The HMf family of proteins, once joined to Taq DNA polymerase or any DNA modifying enzyme with a low intrinsic processivity, can enhance the ability of the enzyme to slide along the DNA substrate and thus increase its processivity. For example, the dimeric HMf-like protein can be covalently linked to the N terminus of Taq DNA polymerase, e.g., via chemical modification, and thus improve the processivity of the polymerase.

Those of skill in the art will recognize that other double-stranded sequence-non-specific nucleic acid binding domains are known in the art and can also be used as described herein.

Size Selection

Modified, accessible gDNA can optionally be enriched by size selection. gDNA fragments that were exposed to a cleaving agent can be fractionated according to size using any method in the art, including, e.g., gel filtration, electrophoresis, size exclusion, fractionation on a sucrose gradient or purification on a commercially available device such as the Pippin Prep (Sage Science, on the internet at sagescience.com) or the LabChip XT (Caliper Life Sciences, on the internet at caliperls.com). Accessible chromatin regions will be relatively smaller in size; inaccessible chromatin regions will be relatively larger. Either the relatively larger gDNA fragments representative of inaccessible gDNA or the relatively smaller gDNA fragments representative of accessible gDNA are used for subsequent DNA methylation determination. For example, in some embodiments, the DNA is selected for fragments larger than 100, 500, or 1000 base pairs or other sizes, including but not limited to, 500-1000, 500-2000, 500-3000 or 500-8000 base pairs.
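The size-selection step described above can be illustrated computationally. The following is a minimal sketch, assuming fragment lengths (in base pairs) have already been measured; the function name `select_fragments_by_size`, its inclusive bounds, and the example lengths are hypothetical and are not part of the described methods.

```python
from typing import Iterable, List, Optional


def select_fragments_by_size(
    lengths_bp: Iterable[int],
    min_bp: Optional[int] = None,
    max_bp: Optional[int] = None,
) -> List[int]:
    """Keep fragment lengths within [min_bp, max_bp]; a bound of None is open."""
    selected = []
    for length in lengths_bp:
        if min_bp is not None and length < min_bp:
            continue
        if max_bp is not None and length > max_bp:
            continue
        selected.append(length)
    return selected


# Hypothetical fragment lengths (bp) after exposure to a cleaving agent.
fragments = [80, 150, 450, 500, 900, 1000, 2500, 8000]

# Relatively larger fragments, taken here as representative of inaccessible gDNA.
larger = select_fragments_by_size(fragments, min_bp=500)

# A bounded window, e.g., 500-1000 bp.
window = select_fragments_by_size(fragments, min_bp=500, max_bp=1000)
```

In practice the cut-offs would be chosen to suit the cleaving agent and the fractionation device used.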

c. Isolating or Purifying Genomic DNA

Following digestion or modification of the desired subpopulation of genomic DNA, the genomic DNA (enriched and not enriched) is isolated from the cells according to any method available. Essentially any DNA purification procedure can be used so long as it results in DNA of acceptable purity for the subsequent methylation detection and quantification step(s). For example, standard cell lysis reagents can be used to lyse cells. Optionally a protease (including but not limited to proteinase K) can be used. DNA can be isolated from the mixture as is known in the art. In some embodiments, phenol/chloroform extractions are used and the DNA can be subsequently precipitated (e.g., by ethanol) and purified. In some embodiments, RNA is removed or degraded (e.g., with an RNase or with use of a DNA purification column), if desired.

4. Determining DNA Methylation Status

a. Types of DNA Methylation

DNA methylation usually refers to 5-methylcytosine (5-mC) as the methylated base. However, other examples of methylated DNA bases, including 5-hydroxymethylcytosine (5-hmC), glucosyl-5-hydroxymethylcytosine (5-ghmC), 4-methylcytosine (4-mC) and 6-methyladenine (6-mA), exist and are included as forms of DNA methylation.

The extent, quality and/or patterning of methylation of gDNA in a sample can be determined genome-wide, e.g., for the entirety of the gDNA in the sample (e.g., the total gDNA or subpopulation of gDNA), or at one or more preselected target gDNA regions.

b. Target DNA Regions

A gDNA region is a target sequence of interest within genomic DNA. Any gDNA sequence in genomic DNA of a cell can be evaluated for gDNA methylation status. gDNA regions can be screened to identify a DNA region of interest that displays different accessibility in different cell types, between untreated cells and cells exposed to a drug, chemical or environmental stimulus, or between normal and diseased tissue, for example. Thus, in some embodiments, the methods of the invention are used to identify a DNA region whose change in accessibility acts as a marker for disease (or lack thereof). Exemplary diseases include but are not limited to cancers. A number of genes have been described that have altered DNA methylation and/or chromatin structure in cancer cells compared to non-cancer cells. In some embodiments, the DNA region is known to be differentially accessible depending on the disease or developmental state of a particular cell.

A variety of DNA regions can be detected either for research purposes and/or as control DNA regions to confirm that the reagents perform as expected. For example, in some embodiments, a DNA region is assayed that is accessible in essentially all cells of an animal. Such DNA regions are useful, for example, as positive controls for accessibility. Such DNA regions can be found, for example, within or adjacent to genes that are constitutive or nearly constitutive. Such genes include those generally referred to as “housekeeping” genes, i.e., genes whose expression is required to maintain basic cellular function. Examples of such genes include, but are not limited to, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and beta actin (ACTB). DNA regions can include all or a portion of such genes, optionally including at least a portion of the promoter.

In some embodiments, a DNA region comprises at least a portion of DNA that is inaccessible in most cells of an animal. Such DNA regions are useful, for example, as negative controls for accessibility or positive controls for inaccessibility. “Inaccessible” in this context refers to DNA regions in which no more than around 20% of the copies of the DNA region are modified. Examples of such gene sequences include those generally recognized as “heterochromatic” and include genes that are only expressed in very specific cell types (e.g., expressed in a tissue or organ-specific fashion). Exemplary genes that are generally inaccessible (with the exception of specific cell types) include, but are not limited to, hemoglobin-beta chain (HBB) and immunoglobulin light chain kappa (IGK).

In some embodiments, the DNA region is a gene sequence which has different accessibility depending on the disease state of the cell or otherwise has variable accessibility depending on the type of cells or growth environment. For example, some genes are generally inaccessible in non-cancer cells but are accessible in cancer cells. Examples of genes with variable accessibility include, e.g., Glutathione-s-transferase pi (GSTP1).

In some embodiments, a DNA region of the invention is selected from a gene sequence (e.g., a promoter sequence) from one or more of the following genes: cadherin 1 type 1 (E-Cadherin), Cytochrome P450-1A1 (CYP1A1), Ras association domain family 1A (RASSF1A), p15, p16, Death associated protein kinase 1 (DAPK), Adenomatous Polyposis Of The Colon (APC), Methylguanine-DNA Methyltransferase (MGMT), Breast Cancer 1 Gene (BRCA1) and hMLH.

In some embodiments, the DNA regions are selected at random, for example, to identify regions that have differential accessibility between different cell types, different conditions, normal vs. diseased cells, etc.

Detection of methylation status can be performed using any method known in the art. General methods for methylation detection include without limitation use of restriction enzymes with activity governed by methylation status, use of antibodies specific for methylated nucleotide bases, polynucleotide sequencing and chemical modification of methylated nucleotide bases, e.g., bisulfite treatment.

c. Restriction-Enzyme Analysis

Restriction enzymes that recognize or are sensitive to methylation find use to assess the presence or absence of DNA methylation in genomic DNA. One can use a methylation-sensing restriction enzyme or other methylation sensing agent to cleave DNA in either a methylation-dependent or methylation-sensitive manner. Exemplary methylation-sensitive restriction enzymes (i.e., enzymes that cut DNA if methylation is absent) include, e.g., cytosine-methylation sensitive restriction enzymes (e.g., AatII, AciI, AclI, AgeI, AluI, AscI, AseI, AsiSI, BbeI, BsaAI, BsaHI, BsiEI, BsiWI, BsrFI, BssHII, BssKI, BstBI, BstNI, BstUI, ClaI, EaeI, EagI, FauI, FseI, HhaI, HinP1I, HinCII, HpaII, Hpy99I, HpyCH4IV, KasI, MboI, MluI, MapAlI, MspI, NaeI, NarI, NotI, PmlI, PstI, PvuI, RsrII, SacII, SapI, Sau3AI, SflI, SfoI, SgrAI, SmaI, SnaBI, TscI, XmaI, and ZraI) and adenosine-methylation sensitive restriction enzymes (e.g., DpnII). Exemplary methylation-dependent restriction enzymes (i.e., enzymes that cut DNA if methylation is present) include, e.g., cytosine-methylation dependent restriction enzymes (e.g., McrBC, GlaI and BlsI) and adenosine-methylation dependent restriction enzymes (e.g., DpnI). Analysis with DNA methylation sensing restriction enzymes is described, e.g., in Holemon et al., BioTechniques (2007) 43:683-693.

DNA methylation usually refers to 5-methylcytosine (5-mC) as the methylated base. However, other examples of methylated DNA bases exist, including without limitation 5-hydroxymethylcytosine (5-hmC), glucosyl-5-hydroxymethylcytosine (5-ghmC), 4-methylcytosine (4-mC) and 6-methyladenine (6-mA), and are contemplated herein as forms of DNA methylation. 5-hmC and 5-ghmC can be analyzed by treatment of the gDNA in the biological samples with a glucosyltransferase, e.g., T4 phage β-glucosyltransferase.

Kits for methylation detection using enzymatic analysis are commercially available, e.g., from SABiosciences (sabiosciences.com), New England Biolabs (neb.com) and Zymo Research (zymoresearch.com). Assays that detect alternate methylated bases include the EpiMark 5-hmC and 5-mC Analysis Kit (New England Biolabs), the Quest 5-hmC Detection Kit (Zymo Research) and the restriction enzyme DpnI (New England Biolabs), which only digests DNA that contains 6-mA.

d. Affinity-Based Analysis

Antibodies that recognize or are sensitive to methylated bases are available and find use to assess the presence or absence of DNA methylation in genomic DNA. Methylated DNA immunoprecipitation (MeDIP or mDIP) is a chromosome- or genome-wide technique that can be used to enrich for methylated DNA sequences. Methylated DNA fragments of about 300-1000 base pairs (bp) in length can be isolated via an antibody raised against 5-methylcytosine (5mC). See, Weber, et al., Nat. Genet. (2005) 37 (8): 853-62. The purified fraction of methylated DNA can be input to high-throughput DNA detection methods such as high-resolution DNA microarrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). See, Down, et al., Nat. Biotechnol. (2008) 26 (7): 779-85 and Jacinto, et al., BioTechniques (2008) 44 (1): 35-39. MeDIP assays, in combination with hybridization on high-resolution microarrays or high-throughput sequencing (HTS) techniques, find use for identifying methylated CpG-rich sequences. Antibodies against 5-methyl cytidine are commercially available, e.g., from Eurogentec (eurogentec.com), Abcam (abcam.com), and Diagenode, and find use to immunoprecipitate methylated DNA fragments.

In various embodiments, methyl-CpG binding domain proteins (MBD) are used in methylation analysis. For example, histidine-tagged MBD2b/MBD3L1 protein complexes are available in kits for enrichment of methylated gDNA (e.g., MethylCollector™ Ultra kit from Active Motif). The MBD protein may also be conjugated to a detectable label, e.g., a fluorophore or a fluorescing protein, for direct detection of methylation. See, e.g., Yu, et al., Anal Chem. (2010) 82(12):5012-9. Antibodies against MBD proteins may also be used to immunoprecipitate regions of gDNA bound by the MBD protein. Antibodies against MBD1, MBD2, MBD3, MBD4, MBD5, MBD6, MBD7 and other MBD variants and isoforms find use and are known in the art. For example, monoclonal antibodies against MBD1 are available from, e.g., Active Motif (activemotif.com), Millipore (millepore.com), Aviva Systems Biology (avivasysbio.com) and Cayman Chemical (caymanchem.com). Antibodies against MBD2 are available from, e.g., Cayman Chemical, USCN Life Science (uscnk.com), Abcam (abcam.com) and Epigentek (epigentek.com). Antibodies against MBD3 are available from, e.g., Active Motif, Abcam, Cell Signaling Technology (cellsignal.com) and Abgent (abgent.com). Antibodies against MBD4 are available from, e.g., Active Motif, Abcam, Abnova Corp. (abnova.com), Sigma-Aldrich (sigmaaldrich.com), Santa Cruz Biotechnology (scbt.com) and Diagenode (diagenode.com). Antibodies against MBD5 are available from, e.g., Antibodies-online (antibodies-online.com), Abcam, Santa Cruz Biotechnology and Novus Biologicals (novusbio.com). Antibodies against MBD6 are available from, e.g., Sigma-Aldrich and Abcam.

Antibodies against 5-methyl cytidine or one or more MBD protein variants can be used alone or in combination. Generally, genomic DNA is extracted from the cells and purified using any method in the art. The purified DNA is cleaved into smaller fragments, using any appropriate methods, including, e.g., enzymatic cleavage or mechanical shearing, e.g., sonication. The resulting fragments preferably are in the range of from 300 to 1000 base pairs (bp) in length. In various embodiments, the DNA fragments can optionally be denatured to produce single-stranded DNA. The DNA is then incubated with antibodies against 5-methyl cytidine or one or more MBD protein variants. Immunoprecipitation techniques known in the art are then applied to enrich for DNA fragments containing target antigen (i.e., 5-methyl cytidine and/or one or more MBD protein variants), and to separate them from unbound DNA, which is washed away with the supernatant. DNA can be removed from the bound antibody using any method known in the art. In various embodiments, a protease can be used to release the bound DNA. For example, proteinase K finds use to digest the antibodies and release the DNA. The immunoprecipitated and released DNA can then be collected and prepared for quantification of the extent, quality and/or patterning of methylation, as described herein. See, e.g., Weber, et al., supra; Pomraning et al., Methods (2009) 47 (3): 142-50; Wilson et al., Cell Cycle (2005) 5: 155-8; and Zhang, et al. Cell (2006) 126 (6): 1189-201.

e. Polynucleotide Sequencing

A variety of methods can be used to determine the nucleotide sequence and the extent to which sequenced nucleotides are modified, e.g., methylated. Any sequencing method known in the art can be used so long as it can simultaneously determine the nucleotide sequence and whether sequenced nucleotides are modified, e.g., methylated. As used herein, “simultaneously” means that as the sequencing process determines the order of nucleotides in a nucleic acid fragment, at the same time it can also distinguish between modified nucleotides (e.g., methylated nucleotides) and nonmodified nucleotides (e.g., non-methylated nucleotides). Examples of sequencing processes that can simultaneously detect nucleotide sequence and distinguish whether sequenced nucleotides are modified include, but are not limited to, single-molecule real-time (SMRT) sequencing and nanopore sequencing.

In some embodiments, nucleotide sequencing comprises template-dependent replication of the DNA region that results in incorporation of labeled nucleotides (e.g., fluorescently labeled nucleotides), and wherein an arrival time and/or duration of an interval between signal generated from different incorporated nucleotides is determinative of the presence or absence of the modification and/or the identity of an incorporated nucleotide.

i. Single-Molecule Real-Time (“SMRT”) Sequencing

In some embodiments, genomic DNA comprising a target DNA region is sequenced by single-molecule, real-time (SMRT) sequencing. SMRT sequencing is a process by which single DNA polymerase molecules are observed in real time while they catalyze the incorporation of fluorescently labeled nucleotides complementary to a template nucleic acid strand. Methods of SMRT sequencing are known in the art and were initially described by Flusberg et al., Nature Methods, 7:461-465 (2010), which is incorporated herein by reference for all purposes.

Briefly, in SMRT sequencing, incorporation of a nucleotide is detected as a pulse of fluorescence whose color identifies that nucleotide. The pulse ends when the fluorophore, which is linked to the nucleotide's terminal phosphate, is cleaved by the polymerase before the polymerase translocates to the next base in the DNA template. Fluorescence pulses are characterized by emission spectra as well as by the duration of the pulse (“pulse width”) and the interval between successive pulses (“interpulse duration” or “IPD”). Pulse width is a function of all kinetic steps after nucleotide binding and up to fluorophore release, and IPD is a function of the kinetics of nucleotide binding and polymerase translocation. Thus, DNA polymerase kinetics can be monitored by measuring the fluorescence pulses in SMRT sequencing.

In addition to measuring differences in fluorescence pulse characteristics for each fluorescently-labeled nucleotide (i.e., adenine, guanine, thymine, and cytosine), differences can also be measured for non-methylated versus methylated bases. For example, the presence of a methylated base alters the IPD of the methylated base as compared to its non-methylated counterpart (e.g., methylated cytosine or adenine as compared to non-methylated cytosine or adenine). Additionally, the presence of a methylated base alters the pulse width of the methylated base as compared to its non-methylated counterpart (e.g., methylated cytosine or adenine as compared to non-methylated cytosine or adenine) and furthermore, different modifications have different pulse widths (e.g., 5-hydroxymethylcytosine has a more pronounced excursion than 5-methylcytosine). Thus, each type of non-modified base and modified base has a unique signature based on its combination of IPD and pulse width in a given context. The sensitivity of SMRT sequencing can be further enhanced by optimizing solution conditions, polymerase mutations and algorithmic approaches that take advantage of the nucleotides' kinetic signatures, and deconvolution techniques to help resolve neighboring methylcytosine bases.
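The use of IPD as a kinetic signature can be illustrated with a simple ratio of mean IPD at each template position in a test sample versus an unmodified control. This is a minimal sketch only: the function names, the IPD values, and the threshold of 2.0 are hypothetical assumptions chosen for illustration, and practical analyses would also use pulse width, sequence context, and statistical testing.

```python
from statistics import mean
from typing import Dict, List


def ipd_ratios(
    sample_ipds: Dict[int, List[float]],
    control_ipds: Dict[int, List[float]],
) -> Dict[int, float]:
    """Per-position ratio of mean sample IPD to mean control IPD."""
    ratios = {}
    for pos, values in sample_ipds.items():
        control = control_ipds.get(pos)
        if not values or not control:
            continue  # skip positions lacking observations in either data set
        ratios[pos] = mean(values) / mean(control)
    return ratios


def flag_candidate_positions(ratios: Dict[int, float], threshold: float = 2.0) -> List[int]:
    """Positions whose IPD ratio meets an (assumed, arbitrary) threshold."""
    return sorted(pos for pos, ratio in ratios.items() if ratio >= threshold)


# Hypothetical IPD observations (seconds) keyed by template position.
sample = {10: [0.2, 0.3, 0.25], 11: [1.0, 1.2, 1.1], 12: [0.3, 0.3, 0.3]}
control = {10: [0.2, 0.3, 0.25], 11: [0.2, 0.25, 0.3], 12: [0.3, 0.3, 0.3]}

candidates = flag_candidate_positions(ipd_ratios(sample, control))
```

Here only position 11, where the polymerase pauses markedly longer in the sample than in the control, is flagged as a candidate modified base.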

ii. Nanopore Sequencing

In some embodiments, nucleotide sequencing does not comprise template-dependent replication of a DNA region. In some embodiments, genomic DNA comprising a target DNA region is sequenced by nanopore sequencing. Nanopore sequencing is a process by which a polynucleotide or nucleic acid fragment is passed through a pore (such as a protein pore) under an applied potential while recording modulations of the ionic current passing through the pore. Methods of nanopore sequencing are known in the art; see, e.g., Clarke et al., Nature Nanotechnology 4:265-270 (2009), which is incorporated herein by reference for all purposes.

Briefly, in nanopore sequencing, as a single-stranded DNA molecule passes through a protein pore, each base is registered, in sequence, by a characteristic decrease in current amplitude which results from the extent to which each base blocks the pore. An individual nucleobase can be identified on a static strand, and by sufficiently slowing the rate of speed of the DNA translocation (e.g., through the use of enzymes) or improving the rate of DNA capture by the pore (e.g., by mutating key residues within the protein pore), an individual nucleobase can also be identified while moving.

In some embodiments, nanopore sequencing comprises the use of an exonuclease to liberate individual nucleotides from a strand of DNA, wherein the bases are identified in order of release, and the use of an adaptor molecule that is covalently attached to the pore in order to permit continuous base detection as the DNA molecule moves through the pore. As the nucleotide passes through the pore, it is characterized by a signature residual current and a signature dwell time within the adaptor, making it possible to discriminate between nonmethylated nucleotides. Additionally, different dwell times are seen between methylated nucleotides and the corresponding non-methylated nucleotides (e.g., 5-methyl-dCMP has a longer dwell time than dCMP), thus making it possible to simultaneously determine nucleotide sequence and whether sequenced nucleotides are modified. The sensitivity of nanopore sequencing can be further enhanced by optimizing salt concentrations, adjusting the applied potential, pH, and temperature, or mutating the exonuclease to vary its rate of processivity.

f. Bisulfite Modification

In some embodiments, bisulfite modification is used to determine the extent of methylation. Bisulfite modification is a preliminary step. In bisulfite modification, the DNA is contacted with bisulfite, thereby converting unmethylated cytosines to uracils in the DNA. The methylation of a particular DNA region can then be determined by any number of methylation detection methods. In some embodiments, a high resolution melt assay (HRM) can be employed to detect methylation status following bisulfite conversion. In this method, a DNA region is amplified following bisulfite conversion and the resulting amplicon's melting temperature is determined. Because the melting temperature will differ depending on whether the cytosines were converted by bisulfite (and subsequently copied as “T's” in the amplification reaction), melting temperature of the amplicon can be correlated to methylation content. Bisulfite-based methods for detecting methylation are described, e.g., in Kristensen et al., Clin. Chem. (2009) 55:1471-1483. Any method known in the art can be used to assess DNA methylation of bisulfite modified DNA, including without limitation, e.g., MSP, bisulfite sequencing, HeavyMethyl and COBRA. Applicable techniques are described, e.g., in Laird, Nat Rev Cancer. (2003) 3(4):253-66; Laird, Hum Mol Genet. (2005) 14 Spec No 1:R65-76; and Cottrell and Laird, Ann N Y Acad Sci. (2003) 983:120-30.
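The effect of bisulfite conversion on a sequence, and why it shifts amplicon melting behavior, can be sketched in silico. The example below is illustrative only and rests on stated assumptions: the sequence and methylated positions are hypothetical, only one strand is modeled, conversion is treated as complete, and GC fraction is used merely as a rough proxy for melting temperature rather than as a melt-curve model.

```python
from typing import Set


def bisulfite_convert(seq: str, methylated_positions: Set[int]) -> str:
    """Sequence as read after bisulfite conversion and amplification.

    Unmethylated C is converted to U and copied as T; C at a 0-based
    position listed in methylated_positions is protected and stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases; a crude proxy for melting temperature."""
    return sum(base in "GC" for base in seq) / len(seq)


region = "ACGTCGACCG"  # hypothetical DNA region

unmethylated_read = bisulfite_convert(region, set())
methylated_read = bisulfite_convert(region, {1, 4})  # hypothetical methylated cytosines

gc_unmethylated = gc_fraction(unmethylated_read)
gc_methylated = gc_fraction(methylated_read)
```

The methylated template retains more cytosines after conversion, so its amplicon has a higher GC fraction and would be expected to melt at a higher temperature than the amplicon from the unmethylated template.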

Commercial bisulfite conversion kits are readily available and include the MethylDetector™ Bisulfite Modification Kit from Active Motif (activemotif.com), the DNA Methylation Detection Kit from BioChain (biochain.com), Life Technologies' MethylCode Bisulfite Conversion Kit (lifetechnologies.com; appliedbiosystems.com), Millipore's CpGenome Fast DNA Modification Kit (millepore.com), Qiagen's EpiTect Bisulfite Kits (qiagen.com), and Zymo Research's EZ DNA Methylation™ kits.

5. Chromatin Immunoprecipitation (ChIP) and Determining Histone Modification Status

In various embodiments, the methods further comprise assessing the DNA methylation status of a subpopulation of gDNA that was originally associated with a particular histone type, histone modification or a particular protein. This can be conveniently accomplished using Chromatin Immunoprecipitation (ChIP). ChIP can be used to investigate the interaction between proteins and genomic DNA. ChIP finds use to determine whether specific proteins are associated with specific genomic regions, such as transcription factors on promoters or other DNA binding sites. ChIP also finds use to determine specific locations in the genome that various histone modifications are associated with, indicating the target of the histone modifiers. See, e.g., Collas, et al., Mol Biotechnol. (2010) 45(1):87-100; Acevedo, et al., Biotechniques (2007) 43(6):791-7; Oberley, et al., Methods Enzymol (2004) 376:315-34; Birney, et al., Nature (2007) 447(7146):799-816; O'Geen, et al., BioTechniques (2006) 41(5). Detailed protocols for performing ChIP are available online, e.g., at farnham.genomecenter.ucdavis.edu/protocol.html, and find use. Differences in histone modification or proteins associated with a DNA subpopulation relative to total DNA can have scientific, biological, clinical, physiological or pathological relevance. The procedure to determine such information varies according to the DNA subpopulation analyzed.

Generally, in performing ChIP, proteins are temporarily bonded to chromatin in a cell lysate, the DNA-protein complexes (chromatin-protein) are then sheared, and DNA fragments associated with the protein(s) of interest are selectively immunoprecipitated. In various embodiments, the DNA-protein complexes are reversibly cross-linked, e.g., using UV light or formaldehyde, prior to immunoprecipitation. The cross-linked chromatin can be sheared into fragments of about 300-1000 base pairs (bp) in length, e.g., using any appropriate methods, including, e.g., enzymatic cleavage or mechanical shearing, e.g., sonication. In other embodiments, the chromatin-protein complexes are not cross-linked. Instead, the chromatin is subject to micrococcal nuclease digestion, which cuts DNA at the length of the linker, leaving nucleosomes intact. Chromatin fragments of about 400-500 bp cover two to three nucleosomes. Protein-DNA complexes are selectively immunoprecipitated using known techniques and antibodies that specifically bind to the protein(s) of interest. The immunoprecipitated complexes are then collected and washed to remove non-specifically bound chromatin, the protein-DNA cross-link is reversed, and proteins are released from the bound DNA, e.g., using a protease. The immunoprecipitated DNA associated with the complex is then purified and the extent, quality and/or patterning of DNA methylation and DNA sequences are determined using any method, including those described herein.

Analysis of histone modifications and/or histone proteins can be performed on gDNA that has been enriched for inaccessible or accessible chromatin. In accordance with the present methods, ChIP allows for further enrichment of a subpopulation of gDNA (either inaccessible or accessible) that is associated with a particular protein or histone modification. The DNA methylation status of the subpopulation can then be determined, e.g., versus a control population (treated or untreated) or the total gDNA population.

Accordingly, various embodiments of the methods further include the step of enriching for histones bearing modifications specifically recognized by antibodies. Illustrative histone modifications of interest for enrichment include, e.g., Histone 3 lysine 4, mono, di and/or tri methylated; Histone 3 lysine 9, mono, di and/or tri methylated; Histone 3 lysine 9, acetylated; Histone 3 lysine 27, mono, di and/or tri methylated; Histone 3 lysine 27, acetylated; Histone 3 lysine 36, mono, di and/or tri methylated; Histone 3 lysine 79, mono, di and/or tri methylated; Histone 4 lysine 20, mono, di and/or tri methylated; acetylated Histone H3; and acetylated Histone H4.

Numerous chromatin immunoprecipitation kits are commercially available and find use. Illustrative kits are available from, e.g., Sigma-Aldrich (sigmaaldrich.com), Active Motif (activemotif.com), Millipore (millepore.com), Thermo Scientific (piercenet.com), R&D Systems (rndsystems.com), Imgenex (imgenex.com) and Epigentek (epigentek.com).

6. Quantifying the Extent, Quality and/or Patterning of Methylation

The invention provides for comparison of the extent, quality or patterning of methylation of a subpopulation of genomic DNA with a reference. The reference can be total genomic DNA or genomic DNA enriched for the same subpopulation in a control. Depending on the test sample, the control sample may be treated or untreated with an agent, e.g., a pharmacological agent or drug. In other embodiments, depending on the test sample, the control sample may be cancerous, pre-cancerous or non-cancerous.

The determination of the extent, quality or patterning of methylation can be genome-wide, e.g., with respect to the entirety of the genomic DNA in the sample being tested, or can be made with reference to a particular region of genomic DNA. As needed or desired, the determination of the extent, quality or patterning of methylation can be target specific (e.g., to one or more particular gDNA regions) or non-target specific. In various embodiments, the extent, quality or patterning of methylation of a first DNA region is compared with a second DNA region in a cell's genome. Alternatively, or in addition, the extent, quality or patterning of methylation of the same DNA region is compared in two different genomic DNA samples, e.g., the subpopulation and control reference. The genomic DNA samples can be from the same or a different cell population. For example, the two cell populations can represent diseased and healthy cells or tissues, different cell types, different stages of development (including but not limited to stem cells or progenitor cells), etc. Thus, by using the methods of the invention one can detect differences in methylation extent, quality or patterning between cells and/or determine relative methylation characteristics between two or more DNA regions (e.g., genes) within one cell. In addition, one can determine the effect of a drug, chemical or environmental stimulus on the chromatin structure/DNA methylation status of a particular region in the same cells or in different cells.

In some embodiments, a difference comprises a 10%, 20%, 25%, 30%, 50%, 75%, 100%, or greater, increase in the extent of methylation in the subpopulation of gDNA in comparison to the gDNA control. In some embodiments, a difference comprises a 10%, 20%, 25%, 30%, 50%, 75%, 100%, or greater, decrease in the extent of methylation in the subpopulation of gDNA in comparison to the gDNA control. In some embodiments, the extent of methylation between the subpopulation gDNA and control gDNA will be approximately the same, but the quality and/or patterning of the methylation will be detectably different.
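A percent increase or decrease of this kind can be computed as sketched below. This is an illustrative calculation only: it assumes the difference is expressed relative to the control value (the text does not specify relative versus absolute difference), and the function name and the example methylation levels are hypothetical.

```python
def percent_change_in_methylation(subpopulation: float, control: float) -> float:
    """Relative change (%) in extent of methylation versus the control.

    Positive values indicate an increase in the subpopulation; negative
    values indicate a decrease. Both inputs must use the same units
    (e.g., percent of CpG sites methylated).
    """
    if control <= 0:
        raise ValueError("control methylation must be positive")
    return (subpopulation - control) / control * 100.0


# Hypothetical extents of methylation (% of sites methylated).
increase = percent_change_in_methylation(subpopulation=60.0, control=40.0)
decrease = percent_change_in_methylation(subpopulation=30.0, control=40.0)

# Assumed example cut-off: treat a change of at least 20% as a difference.
is_different = abs(increase) >= 20.0
```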

The method for quantifying the extent, quality or patterning of methylation will depend on the enriched gDNA subpopulation and method used for determining methylation. In some embodiments, methylation can be detected and quantified using sequencing techniques as described above. For example, all or a representative number of copies of sequences in the sample can be sequenced, thereby providing quantity and sequence information for an enriched class of polynucleotides. In some embodiments, the sequencing can simultaneously determine methylation, also as described above.

In some embodiments, the enriched DNA is hybridized to one or more nucleic acids. In some embodiments, the nucleic acids are linked to a solid support, e.g., a microarray or beads. These embodiments are of particular use for genome-wide analyses as multiple enriched sequences can be simultaneously hybridized to the microarray and hybridization can subsequently be detected and quantified. See, e.g., Nimblegen™ Sequence Capture technology. In some of the embodiments described herein, nucleic acid adaptors are ligated or otherwise linked to the enriched DNA, thereby allowing for convenient amplification and/or sequencing of the enriched DNA.

In other embodiments, double stranded DNA cleavage events (e.g., as introduced by a restriction enzyme or DNase or introduced following modification, e.g., by a methylation-sensitive or -dependent restriction enzyme following methyltransferase treatment, or following modification by a DNA modifying chemical as described herein) can be conveniently detected using an amplification reaction designed to generate an amplicon that comprises a DNA region of interest. In the case of cleavage events at defined sites, such as when a sequence-specific restriction enzyme is used, primers are designed to generate an amplicon that spans a potential cleavage site. Only intact DNA will be amplified. If one also knows the amount of total DNA, one can calculate the amount of cleaved DNA as the difference between total and intact DNA. The total amount of DNA can be determined according to any method of DNA quantification known in the art. In some embodiments, the amount of total DNA can be conveniently determined by designing a set of primers that amplify the DNA regardless of modification. This can be achieved, for example, by designing primers that do not span a potential cleavage site, either within the same gene region or in another DNA region. In the case of cleavage events at indeterminate sites, such as when a non-sequence-specific nuclease, such as DNase I, is used, an inaccessible reference gene should be incorporated as an internal control.
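The arithmetic described above, in which cleaved DNA is the difference between total and intact DNA, can be written out as follows. This is a minimal sketch; the function names and copy numbers are hypothetical, and the quantities are assumed to be expressed in the same units (e.g., copies of the DNA region).

```python
def cleaved_amount(total: float, intact: float) -> float:
    """Amount of cleaved DNA as total minus intact DNA."""
    if total <= 0:
        raise ValueError("total DNA must be positive")
    if not 0 <= intact <= total:
        raise ValueError("intact DNA must lie between 0 and total DNA")
    return total - intact


def fraction_cleaved(total: float, intact: float) -> float:
    """Cleaved DNA expressed as a fraction of total DNA."""
    return cleaved_amount(total, intact) / total


# Hypothetical copy numbers: 'total' from primers that do not span a
# cleavage site, 'intact' from primers that span a potential cleavage site.
total_copies = 1000.0
intact_copies = 250.0

cleaved = cleaved_amount(total_copies, intact_copies)
cleaved_fraction = fraction_cleaved(total_copies, intact_copies)
```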

As discussed in more detail below, quantitative amplification (including, for example, real-time PCR) methods allow for determination of the amount of intact copies of a DNA region, and when used with various controls can be used to determine the relative amount of intact DNA compared to the total number of copies in the cell. The actual or relative number (e.g., relative to the total number of copies or relative to the number of modified or cleaved or unmodified or uncleaved copies of a second DNA region) of modified or unmodified copies of the DNA region can thus be calculated.

In some embodiments of the invention, the number of modified copies of a DNA region is determined directly following enrichment for cleaved or uncleaved DNA. For example, restriction enzyme cleavage can be detected and quantified by detecting specific ligation events that will occur only in the presence of specific sticky or blunt ends. Nucleic acid adaptors comprising sticky ends that are complementary to sticky ends generated by a restriction enzyme can be ligated to the cleaved genomic DNA. The number of ligation events can then be detected and quantified (e.g., by a quantitative amplification method).

In some embodiments, ligation mediated PCR (LM-PCR) is employed to quantify the number of cleaved copies of a DNA region. Methods of LM-PCR are known in the art and were initially described in Pfeifer et al., Science 246: 810-813 (1989). LM-PCR can be performed in real-time for quantitative results if desired.

Quantitative amplification methods (e.g., quantitative PCR or quantitative linear amplification) involve amplification of a nucleic acid template, directly or indirectly (e.g., determining a Ct value) determining the amount of amplified DNA, and then calculating the amount of initial template based on the number of cycles of the amplification. Amplification of a DNA locus using amplification reactions is well known (see U.S. Pat. Nos. 4,683,195 and 4,683,202; PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS (Innis et al., eds, 1990)). Typically, PCR is used to amplify DNA templates. However, alternative methods of amplification have been described and can also be employed, as long as the alternative methods amplify intact DNA to a greater extent than they amplify cleaved DNA. Methods of quantitative amplification are disclosed in, e.g., U.S. Pat. Nos. 6,180,349; 6,033,854; and 5,972,602, as well as in, e.g., Gibson et al., Genome Research 6:995-1001 (1996); DeGraves, et al., Biotechniques 34(1):106-10, 112-5 (2003); Deiman B, et al., Mol Biotechnol. 20(2):163-79 (2002). Amplifications can be monitored in “real time.”

In some embodiments, quantitative amplification is based on the monitoring of the signal (e.g., fluorescence of a probe) representing copies of the template in cycles of an amplification (e.g., PCR) reaction. In the initial cycles of the PCR, a very low signal is observed because the quantity of the amplicon formed does not support a measurable signal output from the assay. After the initial cycles, as the amount of formed amplicon increases, the signal intensity increases to a measurable level and reaches a plateau in later cycles when the PCR enters into a non-logarithmic phase. Through a plot of the signal intensity versus the cycle number, the specific cycle at which a measurable signal is obtained from the PCR reaction can be deduced and used to back-calculate the quantity of the target before the start of the PCR. The cycle number determined by this method is typically referred to as the cycle threshold (Ct). Illustrative methods are described in, e.g., Heid et al. Genome Methods 6:986-94 (1996) with reference to hydrolysis probes.
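The deduction of a threshold cycle from a signal-versus-cycle plot can be sketched as follows. This is a simplified illustration, assuming a fixed, user-chosen signal threshold and linear interpolation between adjacent cycles; real-time PCR instruments typically apply baseline correction and their own threshold algorithms, which are not reproduced here. The function name and the threshold value are hypothetical.

```python
from typing import Optional, Sequence


def cycle_threshold(signal: Sequence[float], threshold: float) -> Optional[float]:
    """Return the fractional cycle at which the signal first reaches the threshold.

    signal[0] is the reading after cycle 1, signal[1] after cycle 2, and so on.
    Returns None if the threshold is never reached.
    """
    if not signal:
        return None
    if signal[0] >= threshold:
        return 1.0
    for i in range(1, len(signal)):
        below, above = signal[i - 1], signal[i]
        if below < threshold <= above:
            # signal[i - 1] belongs to cycle i and signal[i] to cycle i + 1;
            # interpolate linearly between the two readings.
            return i + (threshold - below) / (above - below)
    return None
```

With readings of 0.0, 0.0, 1.0, 3.0 and 7.0 over five cycles and a threshold of 2.0, the crossing falls halfway between cycles 3 and 4, so the function returns 3.5.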

One method for detection of amplification products is the 5′-3′ exonuclease “hydrolysis” PCR assay (also referred to as the TaqMan™ assay) (U.S. Pat. Nos. 5,210,015 and 5,487,972; Holland et al., Proc Natl Acad Sci USA 88: 7276-7280 (1991); Lee et al., Nucleic Acids Res. 21: 3761-3766 (1993)). This assay detects the accumulation of a specific PCR product by hybridization and cleavage of a doubly labeled fluorogenic probe (the “TaqMan™” probe) during the amplification reaction. The fluorogenic probe consists of an oligonucleotide labeled with both a fluorescent reporter dye and a quencher dye. During PCR, this probe is cleaved by the 5′-exonuclease activity of DNA polymerase if, and only if, it hybridizes to the segment being amplified. Cleavage of the probe generates an increase in the fluorescence intensity of the reporter dye.

Another method of detecting amplification products that relies on the use of energy transfer is the “beacon probe” method described by Tyagi and Kramer, Nature Biotech. 14:303-309 (1996), which is also the subject of U.S. Pat. Nos. 5,119,801 and 5,312,728. This method employs oligonucleotide hybridization probes that can form hairpin structures. On one end of the hybridization probe (either the 5′ or 3′ end), there is a donor fluorophore, and on the other end, an acceptor moiety. In the case of the Tyagi and Kramer method, this acceptor moiety is a quencher, that is, the acceptor absorbs energy released by the donor, but then does not itself fluoresce. Thus, when the beacon is in the open conformation, the fluorescence of the donor fluorophore is detectable, whereas when the beacon is in the hairpin (closed) conformation, the fluorescence of the donor fluorophore is quenched. When employed in PCR, molecular beacon probes that hybridize to one of the strands of the PCR product are in the open conformation and fluorescence is detected, while those that remain unhybridized do not fluoresce (Tyagi and Kramer, Nature Biotechnol. 14: 303-306 (1996)). As a result, the amount of fluorescence will increase as the amount of PCR product increases, and thus may be used as a measure of the progress of the PCR. Those of skill in the art will recognize that other methods of quantitative amplification are also available.

Various other techniques for performing quantitative amplification of nucleic acids are also known. For example, some methodologies employ one or more probe oligonucleotides that are structured such that a change in fluorescence is generated when the oligonucleotide(s) is hybridized to a target nucleic acid. For example, one such method is a dual fluorophore approach that exploits fluorescence resonance energy transfer (FRET), e.g., LightCycler™ hybridization probes, where two oligo probes anneal to the amplicon. The oligonucleotides are designed to hybridize in a head-to-tail orientation with the fluorophores separated at a distance that is compatible with efficient energy transfer. Other examples of labeled oligonucleotides that are structured to emit a signal when bound to a nucleic acid or incorporated into an extension product include: Scorpions™ probes (e.g., Whitcombe et al., Nature Biotechnology 17:804-807, 1999, and U.S. Pat. No. 6,326,145), Sunrise™ (or Amplifluor™) probes (e.g., Nazarenko et al., Nuc. Acids Res. 25:2516-2521, 1997, and U.S. Pat. No. 6,117,635), and probes that form a secondary structure that results in reduced signal without a quencher and that emit increased signal when hybridized to a target (e.g., Lux probes™).

In other embodiments, intercalating agents that produce a signal when intercalated in double stranded DNA may be used. Exemplary agents include SYBR GREEN™, SYBR GOLD™, and EVAGREEN™. Since these agents are not template-specific, it is assumed that the signal is generated based on template-specific amplification. This can be confirmed by monitoring signal as a function of temperature because the melting point of template sequences will generally be much higher than that of, for example, primer-dimers.

In some embodiments, the quantity of a DNA region is determined by nucleotide sequencing of copies in a sample and then determining the relative or absolute number of copies having the same sequence in the sample.

Quantification of cleaved or modified (or unmodified or uncleaved) DNA regions according to the method of the invention can be further improved, in some embodiments, by determining the relative amount (e.g., a normalized value such as a ratio or percentage) of cleaved or modified or unmodified or uncleaved copies of the DNA region compared to the total number of copies of that same region. In some embodiments, the relative amount of cleaved or modified or unmodified or uncleaved copies of one DNA region is compared to the number of cleaved or modified or unmodified or uncleaved copies of a second (or more) DNA regions. In some embodiments, when comparing between two or more DNA regions, the relative amount of cleaved or modified or unmodified or uncleaved copies of each DNA region can be first normalized to the total number of copies of the DNA region. Alternatively, when obtained from the same sample, in some embodiments, one can assume that the total number of copies of each DNA region is roughly the same and therefore, when comparing between two or more DNA regions, the relative amount (e.g., the ratio or percentage) of cleaved or modified or unmodified or uncleaved copies between each DNA region is determined without first normalizing each value to the total number of copies.
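The two comparison options described above (normalizing each region to its own total, or assuming equal totals within one sample) can be sketched as follows. This is an illustrative sketch under stated assumptions: the inputs are copy-number estimates already obtained by quantitative amplification, and the function names and argument layout are hypothetical.

```python
from typing import Optional


def relative_amount(modified_copies: float, total_copies: float) -> float:
    """Modified (or cleaved) copies as a fraction of total copies of the same region."""
    if total_copies <= 0:
        raise ValueError("total copy number must be positive")
    return modified_copies / total_copies


def compare_regions(
    modified_a: float,
    modified_b: float,
    total_a: Optional[float] = None,
    total_b: Optional[float] = None,
) -> float:
    """Ratio of region A to region B.

    If both totals are supplied, each region is first normalized to its own
    total. Otherwise the totals are assumed to be roughly equal (same sample)
    and the raw copy numbers are compared directly.
    """
    if modified_b <= 0:
        raise ValueError("region B copy number must be positive")
    if total_a is not None and total_b is not None:
        return relative_amount(modified_a, total_a) / relative_amount(modified_b, total_b)
    return modified_a / modified_b
```

For instance, `compare_regions(300, 100)` compares raw values and gives 3.0, while `compare_regions(400, 100, 1000, 500)` normalizes first (0.4 versus 0.2) and gives approximately 2.0.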

In some embodiments, the actual or relative (e.g., relative to total DNA) amount of cleaved or modified or unmodified or uncleaved copies is compared to a control value. Control values can be conveniently used, for example, where one wants to know whether the accessibility of a particular DNA region exceeds or is under a particular value. For example, in the situation where a particular DNA region is typically accessible in normal cells, but is inaccessible in diseased cells (or vice versa), one may simply compare the actual or relative number of cleaved or modified or unmodified or uncleaved copies to a control value (e.g., greater or less than 20% modified or unmodified, greater or less than 80% modified or unmodified, etc.). Alternatively, a control value can represent past or expected data regarding a control DNA region. In these cases, the actual or relative amount of a control DNA region is determined (optionally a number of times) and the resulting data are used to generate a control value that can be compared with the actual or relative number of cleaved or modified or unmodified or uncleaved copies determined for a DNA region of interest.

The calculations for the methods described herein can involve computer-based calculations and tools. The tools are advantageously provided in the form of computer programs that are executable by a general purpose computer system (referred to herein as a “host computer”) of conventional design. The host computer may be configured with many different hardware components and can be made in many dimensions and styles (e.g., desktop PC, laptop, tablet PC, handheld computer, server, workstation, mainframe). Standard components, such as monitors, keyboards, disk drives, CD and/or DVD drives, and the like, may be included. Where the host computer is attached to a network, the connections may be provided via any suitable transport media (e.g., wired, optical, and/or wireless media) and any suitable communication protocol (e.g., TCP/IP); the host computer may include suitable networking hardware (e.g., modem, Ethernet card, WiFi card). The host computer may implement any of a variety of operating systems, including UNIX, Linux, Microsoft Windows, MacOS, or any other operating system.

Computer code for implementing aspects of the present invention may be written in a variety of languages, including PERL, C, C++, Java, JavaScript, VBScript, AWK, or any other scripting or programming language that can be executed on the host computer or that can be compiled to execute on the host computer. Code may also be written or distributed in low level languages such as assembler languages or machine languages.

The host computer system advantageously provides an interface via which the user controls operation of the tools. In the examples described herein, software tools are implemented as scripts (e.g., using PERL), execution of which can be initiated by a user from a standard command line interface of an operating system such as Linux or UNIX. Those skilled in the art will appreciate that commands can be adapted to the operating system as appropriate. In other embodiments, a graphical user interface may be provided, allowing the user to control operations using a pointing device. Thus, the present invention is not limited to any particular user interface.

7. Reporting Diagnosis and/or Providing Therapy to Subject

The present methods find use as a diagnostic and/or prognostic tool. Once a diagnosis or prognosis is established using the present methods, a regimen of treatment can be established or an existing regimen of treatment can be altered in view of the diagnosis or prognosis. For instance, detection of a cancer cell according to the methods of the invention can lead to the administration of chemotherapeutic agents and/or radiation to an individual from whom the cancer cell was detected.

Accordingly, in some embodiments, the methods further comprise the step of providing a diagnosis to the patient, e.g., based on the information obtained regarding biological, pathological, genetic, epigenetic, or disease status based on information relating to extent, quality and/or patterning of methylation in the subpopulation of DNA in comparison to the control.

In some embodiments, the methods further comprise the step of recommending or providing a regimen of treatment to the patient, e.g., based on the information obtained regarding biological, pathological, genetic, epigenetic, or disease status based on information relating to extent, quality and/or patterning of methylation in the subpopulation of DNA in comparison to the control.

EXAMPLES

The following examples are offered to illustrate, but not to limit, the claimed invention.

Example 1 Epigenetic Changes in a Subpopulation of Cells

The following example illustrates use of the present methods to assess a biopsy. The purpose of the experiment is to determine if a tissue biopsy sample contains a subpopulation of cells that exhibits aberrant epigenetic regulation of a specific genetic biomarker. Such results can indicate the presence of a malignancy in a background of healthy cells.

For the purposes of this example, the genetic biomarker may be a gDNA region corresponding to the promoter of a tumor suppressor gene. Aberrant epigenetic regulation of the genetic biomarker is detected if the DNA is methylated and is in a closed chromatin configuration. Such information indicates that the expression of the genetic biomarker has been inappropriately silenced, which is often associated with malignant cells.

The biopsy is first treated under a condition that dissociates the biopsy sample into a single-cell suspension. A portion of the biopsy sample is then treated with an agent that permeabilizes the cell membrane and allows entry of a nuclease. The nuclease digests accessible chromatin, but does not digest chromatin that is in an inaccessible conformation. A second portion of the biopsy is treated in the same way as the first, but no nuclease is added. It thus serves as a no-nuclease control. Total gDNA is then isolated from both biopsy treatments, and a portion of each gDNA sample is treated with a methylation-dependent restriction enzyme (MDRE) that digests methylated DNA only.

The samples are then assessed by real-time PCR using primer sets that amplify the genetic biomarker. A list of the samples is below:

Sample #   Nuclease treated?   MDRE treated?
1          No                  No
2          No                  Yes
3          Yes                 No
4          Yes                 Yes

The extent of methylation of the genetic biomarker in total gDNA can be estimated by comparing samples 1 and 2. Similarly, the extent of methylation of the genetic biomarker in inaccessible chromatin can be estimated by comparing samples 3 and 4. If the extent of methylation of the genetic biomarker is significantly higher in inaccessible chromatin than in total gDNA, the results imply that the genetic biomarker is inappropriately silenced in a portion of the biopsy, which is suggestive of the presence of a malignancy.
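The four-sample comparison can be sketched as follows. This is a hedged illustration rather than a prescribed analysis: it assumes each sample yields a real-time PCR Ct value, that amplification efficiency is ideal (a doubling per cycle) unless stated otherwise, and that a fixed difference in methylated fraction (here 0.25, an arbitrary placeholder) stands in for "significantly higher". The function names and the cutoff are hypothetical.

```python
from typing import Dict, Tuple


def methylated_fraction(ct_no_mdre: float, ct_mdre: float, efficiency: float = 1.0) -> float:
    """Estimate the methylated fraction from Ct values without and with MDRE digestion.

    An MDRE digests methylated DNA only, so the template that survives
    digestion is taken to be the unmethylated fraction.
    """
    delta_ct = ct_mdre - ct_no_mdre
    surviving = (1.0 + efficiency) ** (-delta_ct)
    # Clamp to [0, 1] because noise can give slightly negative delta Ct values.
    return min(max(1.0 - surviving, 0.0), 1.0)


def compare_total_and_inaccessible(
    ct: Dict[int, float], min_difference: float = 0.25
) -> Tuple[float, float, bool]:
    """ct maps sample number (1 to 4, as in the sample list) to its Ct value."""
    total = methylated_fraction(ct[1], ct[2])          # samples 1 and 2: total gDNA
    inaccessible = methylated_fraction(ct[3], ct[4])   # samples 3 and 4: inaccessible gDNA
    flagged = (inaccessible - total) >= min_difference
    return total, inaccessible, flagged
```

With hypothetical Ct values of 24.0, 24.0, 27.0 and 29.0 for samples 1 to 4, the total gDNA shows no detectable methylation, the inaccessible fraction is estimated at 0.75 methylated, and the comparison is flagged.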

Example 2 Detection of Epigenetic Changes in Inaccessible gDNA

DNA methylation is an epigenetic modification that inactivates genes by compacting the DNA and rendering it inaccessible to transcription factors and the transcriptional machinery. In most human cancers, specific tumor suppressor genes are silenced by aberrant DNA methylation. Such silencing is associated with a change in chromatin conformation from an open, accessible structure to a closed, inaccessible conformation. The present example illustrates the determination of a difference in the DNA methylation state of a tumor suppressor promoter in a subpopulation of inaccessible DNA, relative to the total DNA population in a mixture of DNA derived from cancerous and non-cancerous cells. This example demonstrates the use of the present methods to identify a small number of cancerous cells in a larger background of non-cancerous cells (e.g., as in a tissue biopsy).

The human glutathione S-transferase pi 1 (“GSTP1”) gene promoter was analyzed. In non-cancerous prostate cells, GSTP1 is highly expressed, its promoter is not methylated and it is in an accessible chromatin conformation. In contrast, in cancerous prostate cells GSTP1 is not expressed, its promoter is highly methylated and it is in an inaccessible chromatin configuration (Okino et al., Mol. Carcinog. (2007) 46:839-846). Two human cell lines were analyzed. HCT15 cells are derived from colon tissue and express GSTP1 similarly to non-cancerous prostate tissue. LNCaP cells are a prostate cancer cell line that retains the cancerous GSTP1 expression characteristics.

Methods

In situ nuclease digestion was performed as described in co-pending U.S. application Ser. No. 12/618,076, the entire content of which is incorporated herein by reference for all purposes. Briefly, cells were treated when they reached 90% confluence. The culture medium was aspirated and a permeabilization/digestion buffer was gently layered on the cells. For cells treated with MnlI, the permeabilization/digestion buffer contained lysolecithin, NaCl, Tris-HCl, MgCl₂, DTT, BSA and MnlI. For cells treated with DNase I, the permeabilization/digestion buffer contained lysolecithin, Tris-HCl, MgCl₂, CaCl₂ and DNase I. The permeabilized cells were then incubated at 37° C. for 1 hour. Following incubation, lysis/stop solution (100 mM Tris-HCl pH 7.4, 100 mM NaCl, 100 mM EDTA, 5% N-lauroylsarcosine (w/v), 80 μg/ml RNase A and 3 mg/ml proteinase K) was added to the permeabilization/digestion buffer and the cell lysates were incubated at 37° C. for 10 minutes.

Genomic DNA was isolated from cultured cells by standard procedures. Completely methylated human DNA and completely unmethylated human DNA were purchased from Qiagen and used as controls. The DNA samples were either mock digested or digested with McrBC, HhaI or a combination of both enzymes as described, e.g., in Holemon et al., BioTechniques (2007) 43:683-693. McrBC is a methylation-dependent nuclease that only digests methylated DNA; HhaI is a methylation-sensitive restriction enzyme that only digests DNA that is not methylated. Following enzyme treatment, 5 ng of each DNA sample was amplified by real-time PCR using primers specific for the human GSTP1 promoter. The ΔCt values comparing the mock digested sample with the enzyme digested samples are reported. Undigested DNA results in lower ΔCt values; conversely, digested DNA results in higher ΔCt values.

Results

Analysis of Total DNA

The completely methylated DNA sample was significantly digested with McrBC, but was not digested with HhaI (ΔCt=5.0 and −0.5 respectively). In contrast, the completely unmethylated DNA sample was not digested with McrBC, but was digested with HhaI (ΔCt=−0.2 and 4.8 respectively). In addition, no significant further digestion was detected in either DNA sample after treatment with both enzymes. These results are as expected and demonstrate that the present methods can readily distinguish methylated and unmethylated DNA samples. Analysis of total DNA from HCT15 cells revealed that it had a digestion profile similar to unmethylated DNA. In contrast, total DNA from LNCaP cells had a digestion profile consistent with methylated DNA (Table 1). These findings are consistent with results published in Okino, et al., supra, and indicate that the GSTP1 promoter in LNCaP cells is extensively methylated whereas it has little or no methylation in HCT15 cells.

TABLE 1

DNA Sample                    ΔCt McrBC   ΔCt HhaI   ΔCt McrBC + HhaI
Completely Methylated DNA     5.0         −0.5       5.6
Completely Unmethylated DNA   −0.2        4.8        4.6
Total HCT15 DNA               −0.3        9.5        8.4
Total LNCaP DNA               2.6         −0.3       3.1

Analysis of Mixed DNA Samples Comparing Total DNA and Inaccessible DNA

HCT15 and LNCaP cells were treated with a nuclease in situ to digest accessible chromatin. The remaining DNA, which represents DNA that was originally in an inaccessible chromatin conformation, was then purified. To simulate a biopsy sample containing a small number of cancerous cells in a background of non-cancerous cells, HCT15 and LNCaP DNA samples isolated from untreated cells (representative of total DNA) and nuclease treated cells (representative of inaccessible DNA) were mixed in either a 90:10 ratio or a 97:3 ratio. The mixed DNA samples were digested with McrBC and/or HhaI. The results (Table 2) demonstrate that the GSTP1 promoter in the mixed samples of total DNA was digested by HhaI. In addition, a low level of McrBC digestion was detected because GSTP1 digestion is more complete in the McrBC+HhaI samples than in the samples digested with HhaI alone. These findings indicate that the DNA methylation assay correctly identifies that the GSTP1 promoter is partially methylated in both DNA samples. In addition, because the extent of HhaI digestion is lower in the 90:10 sample than in the 97:3 sample (ΔCt=2.5 and 4.4, respectively), the DNA methylation assay has correctly determined that the extent of GSTP1 promoter methylation is higher in the 90:10 sample than in the 97:3 sample.

TABLE 2

DNA Fraction   HCT15   LNCaP   ΔCt McrBC   ΔCt HhaI   ΔCt McrBC + HhaI
Total          90%     10%     0.0         2.5        6.1
Inaccessible   90%     10%     3.3         0.3        4.6
Total          97%     3%      0.0         4.4        8.1
Inaccessible   97%     3%      1.6         0.6        3.4

The GSTP1 promoter methylation results are much different in the samples where inaccessible DNA is assessed. In both the 90:10 and 97:3 samples, significant digestion with McrBC (ΔCt=3.3 and 1.6 respectively) and little digestion with HhaI (ΔCt=0.3 and 0.6 respectively) were observed. These findings reveal that the GSTP1 promoter in the inaccessible DNA subpopulation is highly methylated and, thus, significantly different from that in the total DNA population. These results indicate that comparison of target gene methylation in total DNA and inaccessible DNA can reveal distinct DNA methylation differences in a small subpopulation, and show that cancerous cells at a 3% level can be detected in a tumor biopsy.
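To give a rough sense of scale for the ΔCt values in Table 2, the following sketch converts each ΔCt into an approximate fraction of template digested. This conversion is an assumption added for illustration: it supposes ideal amplification efficiency (template doubling each cycle), so that the fraction remaining after digestion is 2 raised to the power of −ΔCt. The source reports only ΔCt values, and the names below are hypothetical.

```python
# ΔCt values copied from Table 2: (DNA fraction, HCT15:LNCaP ratio, McrBC, HhaI)
TABLE_2 = [
    ("Total", "90:10", 0.0, 2.5),
    ("Inaccessible", "90:10", 3.3, 0.3),
    ("Total", "97:3", 0.0, 4.4),
    ("Inaccessible", "97:3", 1.6, 0.6),
]


def fraction_digested(delta_ct: float) -> float:
    """Approximate fraction of template digested, assuming doubling per cycle."""
    remaining = 2.0 ** (-delta_ct)
    # Clamp to [0, 1]; slightly negative delta Ct values are treated as no digestion.
    return min(max(1.0 - remaining, 0.0), 1.0)


def summarize(rows):
    """Map (fraction, ratio) to (McrBC fraction digested, HhaI fraction digested)."""
    return {
        (fraction, ratio): (fraction_digested(mcrbc), fraction_digested(hhai))
        for fraction, ratio, mcrbc, hhai in rows
    }


if __name__ == "__main__":
    for (fraction, ratio), (mcrbc, hhai) in summarize(TABLE_2).items():
        print(f"{fraction:<12} {ratio:<6} McrBC {mcrbc:.0%}  HhaI {hhai:.0%}")
```

Under this assumption, McrBC digestion is undetectable in both total DNA mixtures but substantial in both inaccessible DNA mixtures, which mirrors the qualitative conclusion drawn from the ΔCt values.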

Example 3 Detection of Epigenetic Changes Between Two Alleles

Genomic imprinting is an epigenetic process that inactivates one chromosomal allele by DNA methylation and chromatin compaction. The other allele is maintained in an accessible chromatin structure and is transcriptionally active, resulting in monoallelic gene expression. Genomic imprinting occurs in a parent-of-origin specific fashion. Proper imprinting is important for normal development, and imprinting defects are associated with numerous genetic diseases and syndromes.

To detect imprinted genes, cells are split into two portions. The first portion is treated with a nuclease in situ to digest accessible chromatin. The second portion is not treated with a nuclease. gDNA isolated from the first portion of cells represents the subpopulation of gDNA that is in an inaccessible chromatin structure. gDNA isolated from the second cell portion represents total gDNA.

Assessment of the two gDNA samples for DNA methylation at imprinted gene regions can indicate important facets of the imprinting status. For genes that are properly imprinted, it is expected that complete DNA methylation of the candidate genomic region is observed in the gDNA sample representing inaccessible DNA. However, only partial (50%) candidate region DNA methylation is expected when the total gDNA sample is analyzed. For samples that have imprinting defects, deviation from the expected DNA methylation profile is detected.
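The expected profile described above (complete methylation in inaccessible gDNA, about 50% in total gDNA) can be checked with a sketch such as the following. The tolerance of 10 percentage points is a hypothetical placeholder chosen for illustration, as are the function name and the returned labels; an actual assay would need empirically determined cutoffs.

```python
def classify_imprinting(
    inaccessible_pct: float, total_pct: float, tolerance: float = 10.0
) -> str:
    """Compare measured methylation percentages with the expected imprinting profile.

    inaccessible_pct: percent methylation in the inaccessible gDNA sample.
    total_pct: percent methylation in the total gDNA sample.
    """
    for value in (inaccessible_pct, total_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("methylation percentages must lie between 0 and 100")
    inaccessible_ok = abs(inaccessible_pct - 100.0) <= tolerance
    total_ok = abs(total_pct - 50.0) <= tolerance
    if inaccessible_ok and total_ok:
        return "expected profile"
    return "deviation from expected profile"
```

For example, `classify_imprinting(98.0, 52.0)` returns "expected profile", whereas `classify_imprinting(70.0, 50.0)` returns "deviation from expected profile".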

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

1. A method of detecting a biological, pathological, genetic or epigenetic state of a subpopulation of genomic DNA (gDNA) in a sample, the method comprising: a) dividing a biological sample comprising gDNA into at least a first portion and a second portion; b) enriching a subpopulation of gDNA in the first portion; and c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the second portion at the one or more gDNA regions is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA.
2. The method of claim 1, further comprising the step of obtaining the biological sample.
3. The method of claim 1, wherein the biological sample is a population of cells.
4. The method of claim 3, wherein the population of cells is treated with a permeabilization agent and a DNA modification agent prior to the enrichment step b).
5. The method of claim 3, wherein the population of cells is in situ.
6. The method of claim 1, wherein the biological sample is a solid tissue sample.
7. The method of claim 1, wherein the enriching step comprises enriching for accessible chromatin.
8. The method of claim 1, wherein the enriching step comprises enriching for inaccessible chromatin.
9. The method of claim 1, further comprising performing chromatin immunoprecipitation (ChIP).
10. The method of claim 1, wherein the second portion comprises total gDNA.
11. The method of claim 1, wherein the second portion is from a biological sample that has been treated with a pharmacological agent.
12. The method of claim 1, wherein the extent of DNA methylation status is determined via methylation-sensing restriction enzyme analysis.
13. The method of claim 1, wherein the extent of DNA methylation status is determined by contacting the gDNA with bisulfite and detecting methylation of bisulfite-modified gDNA.
14. The method of claim 1, wherein the extent of DNA methylation status is determined via affinity purification.
15. The method of claim 1, wherein the extent of DNA methylation status is determined via direct nucleic acid sequencing.
16. The method of claim 1, wherein the extent of methylation at the one or more gDNA regions in the first portion is higher than the extent of methylation at the one or more gDNA regions in the second portion.
17. The method of claim 1, wherein the extent of methylation at the one or more gDNA regions in the first portion is lower than the extent of methylation at the one or more gDNA regions in the second portion.
18. A method of detecting the presence of cancer in a biological sample, the method comprising: a) dividing a biological sample comprising gDNA, wherein the biological sample comprises cells suspected of being cancerous, into at least a first portion and a second portion; b) enriching a subpopulation of gDNA in the first portion; and c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the enriched gDNA in the first portion relative to the extent of DNA methylation in the gDNA in the second portion at the one or more gDNA regions is correlated with the presence of cancer in the biological sample.
19-39. (canceled)
40. A method of determining genomic imprinting of a preselected gDNA region in a biological sample, the method comprising: a) dividing a biological sample comprising gDNA into at least a first portion and a second portion; b) enriching for inaccessible gDNA in the first portion and retaining total gDNA in the second portion; and c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the inaccessible gDNA in the first portion that is about 100% and the extent of DNA methylation in the total gDNA in the second portion is about 50% at the preselected gDNA region is correlated with proper imprinting of the preselected gDNA region, and wherein an extent of DNA methylation in the inaccessible gDNA in the first portion that is less than about 90% and the extent of DNA methylation in the total gDNA in the second portion is about 50% at the preselected gDNA region is correlated with loss of imprinting of the preselected gDNA region.
41. A method of determining genomic imprinting of a preselected gDNA region in a biological sample, the method comprising: a) dividing a biological sample comprising gDNA into at least a first portion and a second portion; b) enriching for accessible gDNA in the first portion and retaining total gDNA in the second portion; and c) determining the DNA methylation status at the preselected gDNA region in the first portion and in the second portion, wherein an extent of DNA methylation in the accessible gDNA in the first portion that is about 0% and the extent of DNA methylation in the total gDNA in the second portion that is about 50% at the preselected gDNA region is correlated with proper imprinting of the preselected gDNA region.
42. A method of detecting a biological, pathological, genetic or epigenetic state of accessible gDNA in a sample, the method comprising: a) dividing a biological sample comprising gDNA into at least a first portion and a second portion; b) enriching for accessible gDNA in the first portion and retaining total gDNA in the second portion; and c) determining the DNA methylation status at one or more gDNA regions in the first portion and in the second portion, wherein a difference in the extent of DNA methylation in the subpopulation of gDNA in the first portion relative to the extent of DNA methylation in the total gDNA in the second portion at the one or more gDNA regions is correlated with the biological, pathological, genetic or epigenetic state in the subpopulation of gDNA.
43. (canceled)